📖 Document Q&A system with OpenAI GPT, Pinecone Vector DB & Google Drive integration

Description

This workflow contains community nodes that are only compatible with the self-hosted version of n8n.

🤖 AI-Powered Document QA System using Webhook, Pinecone + OpenAI + n8n

This project demonstrates how to build a Retrieval-Augmented Generation (RAG) system in n8n and expose it as a simple question-answering service through a Webhook that connects to a user interface built with Lovable. The system:

🧾 Downloads PDF documents from Google Drive (contracts, user manuals, HR policy documents, etc.)

📚 Converts them into vector embeddings using OpenAI

🔍 Stores and searches them in Pinecone Vector DB

💬 Allows natural-language querying of contracts using AI Agents

📂 Flow 1: Document Loading & RAG Setup

This flow automates:

Reading documents from a Google Drive folder

Vectorizing using text-embedding-3-small

Uploading vectors into Pinecone for later semantic search

🧱 Workflow Structure

A[Manual Trigger] --> B[Google Drive Search]
B --> C[Google Drive Download]
C --> D[Pinecone Vector Store]
D --> E[Default Data Loader]
E --> F[Recursive Character Text Splitter]
E --> G[OpenAI Embedding]

🪜 Steps

Manual Trigger: Kickstarts the workflow on demand for loading new documents.

Google Drive Search & Download

Node: Google Drive (Search: file/folder)

Downloads PDF documents

Apply Recursive Text Splitter: Breaks long documents into overlapping chunks

Settings: Chunk Size: 1000, Chunk Overlap: 100

OpenAI Embedding

Model: text-embedding-3-small, used for creating document vectors

Pinecone Vector Store

Host: url, Index: index, Batch Size: 200

Pinecone Settings:

Type: Dense, Region: us-east-1, Mode: Insert Documents
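To make the chunking and batching settings above concrete, here is a minimal Python sketch. It is not the n8n node's implementation: the Recursive Character Text Splitter also tries to break at natural separators such as paragraphs and sentences, while this simplified stand-in uses fixed character windows. The function names `split_with_overlap` and `batched` are hypothetical and chosen only for illustration.

```python
from typing import Iterator, List

CHUNK_SIZE = 1000    # characters per chunk, as set on the text splitter
CHUNK_OVERLAP = 100  # characters shared by two neighbouring chunks
BATCH_SIZE = 200     # items sent to the vector store per request


def split_with_overlap(text: str, size: int = CHUNK_SIZE,
                       overlap: int = CHUNK_OVERLAP) -> List[str]:
    """Fixed-window splitter: each chunk starts (size - overlap) characters after the previous one."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # this window already reached the end of the text
    return chunks


def batched(items: List[str], batch_size: int = BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive groups of at most batch_size items."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]


if __name__ == "__main__":
    document = "lorem ipsum " * 500  # stand-in for text extracted from a PDF
    chunks = split_with_overlap(document)
    for batch in batched(chunks):
        print(f"would embed and insert {len(batch)} chunks")
```

The overlap means the last 100 characters of one chunk reappear at the start of the next, so a sentence that straddles a boundary is still fully contained in at least one chunk.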

💬 Flow 2: Chat-Based Q&A Agent

This flow enables chat-style querying of the stored documents using an OpenAI-powered agent with conversation memory and a vector store retrieval tool.

🧱 Workflow Diagram

A[Webhook (chat message)] --> B[AI Agent]
B --> C[OpenAI Chat Model]
B --> D[Simple Memory]
B --> E[Answer with Vector Store]
E --> F[Pinecone Vector Store]
F --> G[Embeddings OpenAI]

🪜 Components

Chat (Trigger): Receives incoming chat queries

AI Agent Node

Handles query flow using:

Chat Model: OpenAI GPT

Memory: Simple Memory

Tool: Question Answer with Vector Store

Pinecone Vector Store: Connected to the same index that Flow 1 populates

Embeddings: Embeds each query with the same model as Flow 1, so document chunks can be retrieved by vector similarity

Response Node: Returns final AI response to user via webhook
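Retrieval by vector similarity, as used by the components above, can be illustrated with a small Python sketch. Pinecone performs this search on its side; the code below only shows the idea on toy data. It assumes cosine similarity (the metric configured on the index is not stated here), and the chunk IDs, vectors, and function names are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: List[float], index: Dict[str, List[float]],
          k: int = 2) -> List[Tuple[str, float]]:
    """Return the k chunk IDs whose vectors are most similar to the query vector."""
    scored = [(chunk_id, cosine_similarity(query, vec)) for chunk_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 3-dimensional vectors stand in for real 1536-dimensional embeddings.
toy_index = {
    "termination-clause": [0.9, 0.1, 0.0],
    "payment-terms": [0.1, 0.9, 0.0],
    "leave-policy": [0.0, 0.2, 0.9],
}
toy_query = [0.8, 0.2, 0.1]  # pretend embedding of a question about contract termination

if __name__ == "__main__":
    for chunk_id, score in top_k(toy_query, toy_index):
        print(chunk_id, round(score, 3))
```

In the workflow, the chunks returned by this kind of search are what the agent's vector store tool passes to the chat model as context for the answer.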

🌐 Flow 3: UI-Based Query with Lovable

This flow uses a web UI built using Lovable to query contracts directly from a form interface.

📥 Webhook Setup for Lovable

Webhook Node

Method: POST, URL: url, Response: Using ‘Respond to Webhook’ Node

🧱 Workflow Logic

A[Webhook (Lovable Form)] --> B[AI Agent]
B --> C[OpenAI Chat Model]
B --> D[Simple Memory]
B --> E[Answer with Vector Store]
E --> F[Pinecone Vector Store]
F --> G[Embeddings OpenAI]
B --> H[Respond to Webhook]

💡 Lovable UI

Users can submit:

Full Name

Email

Department

Freeform Query: any question, phrased in the user's own words

The data is sent via webhook to n8n, which responds with an answer drawn from the contract content.
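As a rough illustration of what the form submission could look like on the wire, the Python sketch below builds the POST request without sending it. The JSON key names (`fullName`, `email`, `department`, `query`) and the URL are assumptions made for this example; the actual keys depend on how the Lovable form and the Webhook node are configured.

```python
import json
from urllib import request

# Placeholder only: replace with the URL shown on the Webhook node.
WEBHOOK_URL = "https://example.invalid/webhook/document-qa"


def build_query_request(full_name: str, email: str, department: str,
                        query: str, url: str = WEBHOOK_URL) -> request.Request:
    """Build (but do not send) a JSON POST request carrying the form fields."""
    payload = {
        "fullName": full_name,     # hypothetical key names
        "email": email,
        "department": department,
        "query": query,
    }
    body = json.dumps(payload).encode("utf-8")
    return request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_query_request(
        "Ada Example", "ada@example.com", "Legal",
        "What is the notice period for termination?",
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To actually submit it, pass req to urllib.request.urlopen and read the response.
```

Because the Webhook node is set to respond through the ‘Respond to Webhook’ node, the HTTP response to this request would carry the agent's answer once the workflow finishes.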

🔍 Use Cases

Contract Querying for Legal/HR teams

Procurement & Vendor Agreement QA

Customer Support Automation (based on terms)

RAG Systems for private document knowledge

⚙️ Tools & Tech Stack

n8n, OpenAI (GPT chat model, text-embedding-3-small), Pinecone, Google Drive, Lovable

📌 Final Notes

Pinecone Index: package1536

Dimension: 1536

Chunk Size: 1000, Overlap: 100

Embedding Model: text-embedding-3-small
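The index dimension and the embedding model have to agree: text-embedding-3-small returns 1536-dimensional vectors by default, which matches the dimension listed above. A small guard like the following Python sketch could catch a mismatch before vectors are inserted. The function name `check_dimension` is hypothetical and is not part of n8n or the Pinecone client.

```python
from typing import Sequence

INDEX_DIMENSION = 1536  # dimension listed for the Pinecone index


def check_dimension(vector: Sequence[float], expected: int = INDEX_DIMENSION) -> None:
    """Raise ValueError if an embedding does not match the index dimension."""
    if len(vector) != expected:
        raise ValueError(
            f"vector has {len(vector)} dimensions, but the index expects {expected}"
        )


if __name__ == "__main__":
    check_dimension([0.0] * 1536)  # passes silently
    try:
        check_dimension([0.0] * 3072)  # e.g. a vector from a larger embedding model
    except ValueError as err:
        print(err)
```

Switching to an embedding model with a different output size would require a new index with a matching dimension and re-running Flow 1.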

Feel free to fork the workflow or request the full JSON export. Looking forward to your suggestions and improvements!

🔗 Nodes Used

Webhook, Google Drive, AI Agent, Embeddings OpenAI, OpenAI Chat Model, Simple Memory

📥 Import

Download workflow.json and import into n8n: Workflow menu → Import from File
