Basic RAG chat
5,481 views · Internal Wiki & Knowledge Base
Description
This workflow demonstrates a simple Retrieval-Augmented Generation (RAG) pipeline in n8n, split into two main sections:
Part 1: Load Data into Vector Store
Reads files from disk (or Google Drive).
Splits content into manageable chunks using a recursive text splitter.
Generates embeddings using the Cohere Embedding API.
Stores the vectors in an In-Memory Vector Store (chosen for simplicity; it can be replaced with Pinecone, Qdrant, etc.).
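The chunking step in Part 1 can be illustrated outside n8n. The following is a minimal sketch of recursive character splitting, not the code n8n's node runs: the function name `recursive_split`, the default `chunk_size`, and the separator order are all assumptions made for illustration, and chunk overlap is left out for brevity.

```python
from typing import List, Sequence


def recursive_split(
    text: str,
    chunk_size: int = 200,
    separators: Sequence[str] = ("\n\n", "\n", " ", ""),
) -> List[str]:
    """Split text into chunks of at most chunk_size characters.

    Coarse separators (paragraphs) are tried first; pieces that are
    still too long are split again with the next, finer separator.
    """
    if len(text) <= chunk_size:
        return [text] if text.strip() else []

    # No usable separator left: fall back to fixed-width cuts.
    if not separators or separators[0] == "":
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    sep, finer = separators[0], separators[1:]
    chunks: List[str] = []
    current = ""
    for piece in text.split(sep):
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= chunk_size:
            # The piece still fits: keep packing the current chunk.
            current = candidate
            continue
        if current.strip():
            chunks.append(current)
        current = ""
        if len(piece) > chunk_size:
            # The piece alone is too long: recurse with finer separators.
            chunks.extend(recursive_split(piece, chunk_size, finer))
        else:
            current = piece
    if current.strip():
        chunks.append(current)
    return chunks
```

Each returned chunk would then be passed to the embedding model and stored alongside its vector. Smaller chunks tend to give more precise retrieval at the cost of less context per chunk, so the size is worth tuning for your documents.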
Part 2: Chat with the Vector Store
Takes user input from a chat UI or trigger node.
Embeds the query using the same Cohere embedding model.
Retrieves similar chunks from the vector store via similarity search.
Uses a Groq-hosted LLM to generate the final answer from the retrieved context.
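The retrieval steps above can be sketched in plain Python. This is an illustration under loud assumptions, not the workflow's implementation: `toy_embed` is a bag-of-words stand-in for the Cohere embedding model, `InMemoryVectorStore` and `build_prompt` are hypothetical names, and no LLM is called. The sketch only shows how the query is embedded with the same function as the documents, ranked by cosine similarity, and folded into a prompt.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple

Vector = Dict[str, float]


def toy_embed(text: str) -> Vector:
    """Stand-in for a real embedding model: lowercase word counts."""
    return dict(Counter(re.findall(r"\w+", text.lower())))


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class InMemoryVectorStore:
    """Keeps (chunk, vector) pairs in a list; nothing is persisted."""

    def __init__(self) -> None:
        self._items: List[Tuple[str, Vector]] = []

    def add(self, chunks: List[str]) -> None:
        for chunk in chunks:
            self._items.append((chunk, toy_embed(chunk)))

    def search(self, query: str, k: int = 2) -> List[Tuple[str, float]]:
        """Return the k chunks most similar to the query, best first."""
        query_vec = toy_embed(query)  # same embedding as the documents
        scored = [(chunk, cosine(query_vec, vec)) for chunk, vec in self._items]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:k]


def build_prompt(question: str, context_chunks: List[str]) -> str:
    """Assemble the text that would be sent to the LLM."""
    context = "\n".join(f"- {chunk}" for chunk in context_chunks)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )


store = InMemoryVectorStore()
store.add([
    "n8n is a workflow automation tool",
    "Cohere provides embedding models",
    "Groq serves LLMs with fast inference",
])
top = store.search("what is n8n", k=1)
prompt = build_prompt("what is n8n", [chunk for chunk, _ in top])
```

Using the same embedding function for documents and queries matters: vectors produced by different models are not comparable, which is why the workflow reuses the Cohere model in both parts.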
Technologies Used:
Cohere Embedding API
Groq LLM for fast inference
n8n for orchestrating and visualizing the flow
In-Memory Vector Store (for prototyping)
Usage:
Upload or point to your source documents.
Embed them and populate the vector store.
Ask questions through the chat trigger node.
Receive context-aware responses based on retrieved content.
Nodes Used
Question and Answer Chain, Embeddings Cohere, Vector Store Retriever, Recursive Character Text Splitter, Simple Vector Store, Read/Write Files from Disk
Import
Download workflow.json and import it into n8n:
Workflow menu → Import from File
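Before importing, it can help to check which nodes a downloaded file contains. The sketch below assumes the export is a JSON object with a top-level `nodes` list whose entries carry `name` and `type` fields; the helper names and the `placeholder.*` type strings in the sample are hypothetical, not values taken from this workflow.

```python
import json
from typing import Any, Dict, List, Tuple


def load_workflow(path: str) -> Dict[str, Any]:
    """Read an exported workflow file from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def list_nodes(workflow: Dict[str, Any]) -> List[Tuple[str, str]]:
    """Return (name, type) for every node; empty if the key is absent."""
    return [
        (node.get("name", ""), node.get("type", ""))
        for node in workflow.get("nodes", [])
    ]


# Illustrative data only; a real file would be read with
# load_workflow("workflow.json").
sample = {
    "nodes": [
        {"name": "Simple Vector Store", "type": "placeholder.vectorStore"},
        {"name": "Embeddings Cohere", "type": "placeholder.embeddings"},
    ]
}
for name, node_type in list_nodes(sample):
    print(f"{name}: {node_type}")
```

Comparing the printed names with the "Nodes Used" list is a quick way to confirm that the file matches this description.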