📖 Build a RAG system with automatic citations using Qdrant, Gemini & OpenAI


Description

This workflow implements a Retrieval-Augmented Generation (RAG) system in six stages:

Create Qdrant Collection
- An HTTP Request node calls the Qdrant REST API to create a new collection with vector size 1536 and cosine distance.
Load Files from Google Drive
- The workflow lists all files in a Google Drive folder, downloads each one as plain text, and loops over them.
Text Preprocessing & Embedding
- Documents are split into chunks (500 characters, with 50-character overlap).
- Embeddings are created with an OpenAI embedding model (assumed to be text-embedding-3-small, whose default output has 1536 dimensions).
- Metadata (file name and ID) is attached to each chunk.
Store in Qdrant
- All vectors, along with their metadata, are inserted into the Qdrant collection.
Chat Input & Retrieval
- When a chat message is received, the question is embedded and matched against Qdrant.
- The 5 most relevant document chunks are retrieved.
- A Gemini model generates the answer from those chunks.
Source Aggregation & Response
- File IDs and names are deduplicated.
- The AI response is combined with a list of cited documents (filenames).
- Final output:
```
AI Response

Sources: ["Document1", "Document2"]
```
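The aggregation step above can be sketched in Python. This is only an illustration of the logic, not the workflow's actual node configuration: the function names and the `file_id` / `file_name` metadata keys are assumptions, since the template only says that file ID and name are attached to each chunk.

```python
import json


def dedupe_sources(chunks: list[dict]) -> list[str]:
    """Return file names in first-seen order, one per distinct file ID."""
    seen_ids = set()
    names = []
    for chunk in chunks:
        meta = chunk["metadata"]  # assumed shape: {"file_id": ..., "file_name": ...}
        if meta["file_id"] not in seen_ids:
            seen_ids.add(meta["file_id"])
            names.append(meta["file_name"])
    return names


def format_output(answer: str, names: list[str]) -> str:
    """Append the source list to the answer in the format shown above."""
    return f"{answer}\n\nSources: {json.dumps(names, ensure_ascii=False)}"


# Hypothetical retrieved chunks: two come from the same file.
retrieved = [
    {"metadata": {"file_id": "1", "file_name": "Document1"}},
    {"metadata": {"file_id": "2", "file_name": "Document2"}},
    {"metadata": {"file_id": "1", "file_name": "Document1"}},
]
print(format_output("AI Response", dedupe_sources(retrieved)))
```

Deduplicating on the file ID rather than the name keeps two different files that happen to share a name from being merged.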

End-to-end Automation: Once triggered, every stage from document ingestion to chat response generation runs without manual steps.
Scalable Knowledge Base: Easy to expand by simply adding files to the Google Drive folder.
Traceable Responses: Each answer includes its source files, increasing transparency and trustworthiness.
Modular Design: Each step (embedding, storage, retrieval, response) is isolated and reusable.
Multi-provider AI: Combines OpenAI (for embeddings) and Google Gemini (for chat), so the embedding model and the chat model can be chosen independently.
Secure & Customizable: API keys are held in n8n credentials, and the chunk size, collection name, and similar settings are configurable.

Document Processing & Vectorization
- The workflow retrieves documents from a specified Google Drive folder.
- Each file is downloaded, split into chunks (using a recursive text splitter), and converted into embeddings via OpenAI.
- The embeddings, along with metadata (file ID and name), are stored in a Qdrant vector database under the collection negozio-emporio-verde.
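To make the chunking step concrete, here is a minimal Python sketch. It uses a plain fixed-size sliding window with the 500/50 settings mentioned in the description, which is simpler than the recursive text splitter the workflow actually uses (that splitter also tries to break on separators such as paragraphs and sentences). The function names and the `content` / `metadata` record layout are assumptions for illustration.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into windows of chunk_size characters that share `overlap` characters."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


def attach_metadata(chunks: list[str], file_id: str, file_name: str) -> list[dict]:
    """Pair every chunk with the ID and name of the file it came from."""
    return [
        {"content": chunk, "metadata": {"file_id": file_id, "file_name": file_name}}
        for chunk in chunks
    ]


# Synthetic 1200-character document, used only to show the window arithmetic.
document = "".join(chr(97 + i % 26) for i in range(1200))
chunks = chunk_text(document)
records = attach_metadata(chunks, file_id="abc123", file_name="FAQ.txt")
print([len(c) for c in chunks])  # [500, 500, 300]
```

Each record would then be embedded and written to Qdrant together with its metadata, which is what later allows an answer to be traced back to a file.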
Query Handling & Response Generation
- When a user submits a chat message, the workflow:
  - Embeds the query using OpenAI.
  - Retrieves the top 5 relevant document chunks from Qdrant.
  - Uses Google Gemini to generate a response based on the retrieved context.
  - Aggregates and deduplicates the source file names from the retrieved chunks.
- The final output includes both the AI-generated response and a list of source documents (e.g., Sources: ["FAQ.pdf", "Policy.txt"]).
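The retrieval half of this flow can be sketched as follows. The search body follows Qdrant's REST search endpoint (`POST /collections/{collection}/points/search`); the prompt wording, the function names, the payload layout, and the sample hit are assumptions made for illustration, since in the workflow the Question and Answer Chain node builds the prompt internally.

```python
def build_search_body(query_vector: list[float], top_k: int = 5) -> dict:
    """JSON body for a Qdrant similarity search that also returns stored payloads."""
    return {"vector": query_vector, "limit": top_k, "with_payload": True}


def build_prompt(question: str, hits: list[dict]) -> str:
    """Number the retrieved chunks and place them ahead of the question."""
    context = "\n\n".join(
        f"[{i}] {hit['payload']['content']}" for i, hit in enumerate(hits, start=1)
    )
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Placeholder vector; in the workflow this comes from the OpenAI embeddings node.
body = build_search_body([0.0] * 1536)

# Made-up search hit in the assumed payload layout.
hits = [
    {
        "score": 0.91,
        "payload": {
            "content": "Returns are accepted within 30 days.",
            "metadata": {"file_id": "1", "file_name": "Policy.txt"},
        },
    }
]
prompt = build_prompt("What is the return policy?", hits)
print(prompt)
```

Requesting the payload along with each hit is what makes the file metadata available for the source list.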

Configure Qdrant Collection
- Replace QDRANTURL and COLLECTION in the “Create collection” HTTP Request node to initialize the Qdrant collection with:
  - Vector size: 1536 (the default output dimension of OpenAI's text-embedding-3-small).
  - Distance metric: Cosine.
- Ensure the “Clear collection” node is configured to reset the collection if needed.
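For reference, the request that the “Create collection” node sends can be reproduced outside n8n. The sketch below builds the same kind of call against Qdrant's REST API (`PUT /collections/{collection}`) using only the Python standard library; the URL and collection name are placeholders, and the helper name is hypothetical.

```python
import json
import urllib.request


def build_create_collection_request(
    qdrant_url: str, collection: str, size: int = 1536, distance: str = "Cosine"
) -> urllib.request.Request:
    """Build (but do not send) the PUT request that creates a Qdrant collection."""
    body = {"vectors": {"size": size, "distance": distance}}
    return urllib.request.Request(
        url=f"{qdrant_url.rstrip('/')}/collections/{collection}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Placeholder values standing in for QDRANTURL and COLLECTION.
req = build_create_collection_request("http://localhost:6333", "my_collection")
print(req.get_method(), req.full_url)
# urllib.request.urlopen(req)  # uncomment to send it to a running Qdrant instance
```

The vector size has to match the embedding model's output dimension, so it needs to change if a different embedding model is used.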
Google Drive & OpenAI Integration
- Link the Google Drive node to the target folder (Test Negozio in this example).
- Verify OpenAI and Google Gemini API credentials are correctly set in their respective nodes.
Metadata & Output Customization
- Adjust the “Aggregate” and “Response” nodes if additional metadata fields are needed.
- Modify the “Output” node to format the response (e.g., changing Sources: {{...}} to match your preferred style).
Testing
- Trigger the workflow manually to test document ingestion.
- Use the chat interface to verify responses include accurate source attribution.

Note: Replace placeholder values (e.g., QDRANTURL) with actual endpoints before deployment.


Nodes used: HTTP Request, Google Drive, Question and Answer Chain, Embeddings OpenAI, Vector Store Retriever, Recursive Character Text Splitter

Download workflow.json and import it into n8n: Workflow menu → Import from File