Document Q&A system with OpenAI GPT, Pinecone Vector DB & Google Drive integration
Description
This workflow contains community nodes that are only compatible with the self-hosted version of n8n.
AI-Powered Document QA System using Webhook, Pinecone + OpenAI + n8n
This project demonstrates how to build a Retrieval-Augmented Generation (RAG) system in n8n and expose it as a simple question-answering service through a webhook connected to a user interface built with Lovable. The workflow:
Downloads PDF documents from Google Drive (contract documents, user manuals, HR policy documents, etc.)
Converts them into vector embeddings using OpenAI
Stores and searches them in Pinecone Vector DB
Allows natural-language querying of contracts using AI Agents
Flow 1: Document Loading & RAG Setup
This flow automates:
Reading documents from a Google Drive folder
Vectorizing using text-embedding-3-small
Uploading vectors into Pinecone for later semantic search
Workflow Structure
A[Manual Trigger] --> B[Google Drive Search]
B --> C[Google Drive Download]
C --> D[Pinecone Vector Store]
D --> E[Default Data Loader]
E --> F[Recursive Character Text Splitter]
E --> G[OpenAI Embedding]
Steps
Manual Trigger: Kickstarts the workflow on demand for loading new documents.
Google Drive Search & Download
Node: Google Drive (Search: file/folder)
Downloads PDF documents
Apply Recursive Text Splitter: Breaks long documents into overlapping chunks
Settings: Chunk Size: 1000, Chunk Overlap: 100
OpenAI Embedding
Model: text-embedding-3-small (used for creating document vectors)
Pinecone Vector Store
Host: url, Index: index, Batch Size: 200
Pinecone Settings:
Type: Dense, Region: us-east-1, Mode: Insert Documents
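The splitter and batch settings above can be illustrated with a minimal sketch. It is a simplification under stated assumptions: it uses a plain sliding window rather than the recursive, separator-aware strategy of the n8n splitter node, and `chunk_text` and `batched` are hypothetical helper names, not part of n8n or Pinecone.

```python
from typing import Iterator, List, Sequence

CHUNK_SIZE = 1000    # Chunk Size from the splitter settings
CHUNK_OVERLAP = 100  # Chunk Overlap from the splitter settings
BATCH_SIZE = 200     # Batch Size from the Pinecone node settings


def chunk_text(text: str, size: int = CHUNK_SIZE,
               overlap: int = CHUNK_OVERLAP) -> List[str]:
    """Split text into windows of `size` characters sharing `overlap` characters.

    Assumption: a fixed sliding window stands in for the recursive splitter.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap  # each new chunk starts this far after the last
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # this chunk already reached the end of the text
    return chunks


def batched(items: Sequence, batch_size: int = BATCH_SIZE) -> Iterator[Sequence]:
    """Yield consecutive slices of at most `batch_size` items."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]
```

With these values, each chunk repeats the last 100 characters of the previous one, so a sentence that straddles a chunk boundary still appears whole in at least one chunk, provided it is shorter than the overlap.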
Flow 2: Chat-Based Q&A Agent
This flow enables chat-style querying of stored documents using OpenAI-powered agents with vector memory.
Workflow Diagram
A[Webhook (chat message)] --> B[AI Agent]
B --> C[OpenAI Chat Model]
B --> D[Simple Memory]
B --> E[Answer with Vector Store]
E --> F[Pinecone Vector Store]
F --> G[Embeddings OpenAI]
Components
Chat (Trigger): Receives incoming chat queries
AI Agent Node
Handles query flow using:
Chat Model: OpenAI GPT
Memory: Simple Memory
Tool: Question Answer with Vector Store
Pinecone Vector Store: Connected via same embedding index as Flow 1
Embeddings: Ensures document chunks are retrievable using vector similarity
Response Node: Returns final AI response to user via webhook
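Retrieval by vector similarity, as listed in the components above, can be illustrated with a minimal sketch. It assumes cosine similarity as the metric (the source does not say which metric the Pinecone index uses) and replaces the Pinecone query with an in-memory scan. `cosine_similarity` and `top_k` are hypothetical helper names.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def top_k(query_vec: Sequence[float],
          stored: Sequence[Tuple[str, Sequence[float]]],
          k: int = 3) -> List[Tuple[str, float]]:
    """Return the k stored chunks most similar to the query, best first.

    Assumption: `stored` holds (chunk_text, embedding) pairs in memory;
    in the workflow this lookup is done by the Pinecone Vector Store node.
    """
    scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in stored]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

The agent's tool would then pass the returned chunk texts to the chat model as context for the answer.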
Flow 3: UI-Based Query with Lovable
This flow uses a web UI built using Lovable to query contracts directly from a form interface.
Webhook Setup for Lovable
Webhook Node
Method: POST, URL: url, Response: Using "Respond to Webhook" Node
Workflow Logic
A[Webhook (Lovable Form)] --> B[AI Agent]
B --> C[OpenAI Chat Model]
B --> D[Simple Memory]
B --> E[Answer with Vector Store]
E --> F[Pinecone Vector Store]
F --> G[Embeddings OpenAI]
B --> H[Respond to Webhook]
Lovable UI
Users can submit:
Full Name
Email
Department
Freeform Query: any free-text question
The form data is sent to n8n via the webhook, and the workflow responds with an answer drawn from the contract content.
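A client-side call to the webhook might look like the following sketch. The JSON field names (`full_name`, `email`, `department`, `query`) are assumptions for illustration, since the source lists the form fields but not the payload keys. The webhook URL is not given in the source, so it is left as a parameter.

```python
import json
import urllib.request


def build_query_request(webhook_url: str, full_name: str, email: str,
                        department: str, query: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the form fields as JSON.

    Assumption: the payload keys below are hypothetical; match them to
    whatever the Webhook node in the workflow actually expects.
    """
    payload = {
        "full_name": full_name,
        "email": email,
        "department": department,
        "query": query,
    }
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it. The shape of the response depends on how the "Respond to Webhook" node is configured, which the source does not specify.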
Use Cases
Contract Querying for Legal/HR teams
Procurement & Vendor Agreement QA
Customer Support Automation (based on terms)
RAG Systems for private document knowledge
Final Notes
Pinecone Index: package1536
Dimension: 1536
Chunk Size: 1000, Overlap: 100
Embedding Model: text-embedding-3-small
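Because the index dimension must match the length of the embedding vectors, a small guard before inserting can catch a mismatched model early. The sketch below collects the values from these notes. `RagSettings` and `check_vector` are hypothetical names and not part of n8n, OpenAI, or Pinecone.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class RagSettings:
    """Values copied from the notes above."""
    index_name: str = "package1536"
    dimension: int = 1536
    chunk_size: int = 1000
    chunk_overlap: int = 100
    embedding_model: str = "text-embedding-3-small"


def check_vector(vector: Sequence[float],
                 settings: RagSettings = RagSettings()) -> None:
    """Raise ValueError if the vector length differs from the index dimension."""
    if len(vector) != settings.dimension:
        raise ValueError(
            f"expected {settings.dimension} dimensions for index "
            f"{settings.index_name!r}, got {len(vector)}"
        )
```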
Feel free to fork the workflow or request the full JSON export. Looking forward to your suggestions and improvements!
Nodes Used
Webhook, Google Drive, AI Agent, Embeddings OpenAI, OpenAI Chat Model, Simple Memory
Import
Download workflow.json and import into n8n:
Workflow menu → Import from File
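Before importing, it can help to list the nodes in the downloaded file, for example to spot community nodes that need to be installed first. The sketch below assumes the export has a top-level `nodes` array whose entries carry `name` and `type` fields. `list_nodes` is a hypothetical helper.

```python
import json
from typing import List, Tuple


def list_nodes(workflow_json: str) -> List[Tuple[str, str]]:
    """Return (name, type) for every node in an exported workflow.

    Assumption: nodes live under a top-level "nodes" key; missing
    fields are returned as empty strings rather than raising.
    """
    workflow = json.loads(workflow_json)
    return [(node.get("name", ""), node.get("type", ""))
            for node in workflow.get("nodes", [])]
```

For a file on disk, read its text first, for example `list_nodes(open("workflow.json", encoding="utf-8").read())`.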