📖 Create a RAG system with Paul Essays, Milvus, and OpenAI for cited answers

Description

This workflow automates the process of creating a document-based AI retrieval system using Milvus, an open-source vector database. It consists of two main steps:

  1. Data collection/processing
  2. Retrieval/response generation

The system scrapes Paul Graham essays, processes them, and loads them into a Milvus vector store. When users ask questions, it retrieves relevant information and generates responses with citations.

Step 1: Data Collection and Processing

  1. Set up a Milvus server using the official guide
  2. Create a collection named “my_collection”
  3. Execute the workflow to scrape Paul Graham essays:
    • Fetch essay lists
    • Extract names
    • Split content into manageable items
    • Limit results (if needed)
    • Fetch texts
    • Extract content
    • Load everything into Milvus Vector Store
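
The fetch, extract, and limit steps above can be sketched outside n8n as follows. This is a minimal illustration using only the Python standard library, not the workflow's actual node configuration. The names `LinkCollector` and `extract_essays`, the sample HTML, and the `https://example.com/` base URL are all hypothetical placeholders.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect (href, link text) pairs from anchor tags."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only collect text while inside an anchor that has an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_essays(index_html, base_url, limit=None):
    """Return essay titles and absolute URLs found in an index page."""
    parser = LinkCollector()
    parser.feed(index_html)
    essays = [
        {"title": title, "url": urljoin(base_url, href)}
        for href, title in parser.links
        if href.endswith(".html") and title  # keep named links to HTML pages only
    ]
    return essays[:limit] if limit is not None else essays


# Placeholder index page; a real run would fetch the essay list over HTTP first.
SAMPLE_INDEX = """
<a href="first.html">First Essay</a>
<a href="second.html">Second Essay</a>
<a href="index.html"></a>
<a href="mailto:someone@example.com">Contact</a>
"""

essays = extract_essays(SAMPLE_INDEX, "https://example.com/", limit=1)
```

Passing `limit` mirrors the optional "Limit results" step, which keeps test runs small before the full essay list is loaded.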

This step uses OpenAI embeddings for vectorization.
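
For readers who want to see the same load step outside n8n, here is a rough sketch using the `openai` and `pymilvus` Python packages. It rests on several assumptions that are not part of the workflow: the embedding model name `text-embedding-3-small`, a Milvus server at `http://localhost:19530`, a quick-setup collection schema, and the hypothetical helper names `build_rows`, `embed_texts`, and `load_into_milvus`. The schema that the n8n Milvus node expects may differ.

```python
def build_rows(chunks, vectors):
    """Pair each text chunk with its embedding in a row shape suitable for insertion."""
    if len(chunks) != len(vectors):
        raise ValueError("need exactly one vector per chunk")
    return [
        {"id": i, "vector": vec, "text": chunk["text"], "title": chunk["title"]}
        for i, (chunk, vec) in enumerate(zip(chunks, vectors))
    ]


def embed_texts(texts, model="text-embedding-3-small"):
    """Embed texts with the OpenAI API (reads OPENAI_API_KEY from the environment)."""
    # Imported lazily so build_rows stays usable without the package installed.
    from openai import OpenAI

    response = OpenAI().embeddings.create(model=model, input=texts)
    return [item.embedding for item in response.data]


def load_into_milvus(chunks, uri="http://localhost:19530", collection="my_collection"):
    """Embed chunks and insert them into a Milvus collection (assumed local server)."""
    from pymilvus import MilvusClient

    vectors = embed_texts([chunk["text"] for chunk in chunks])
    client = MilvusClient(uri=uri)
    if not client.has_collection(collection_name=collection):
        # Quick-setup schema: integer "id", float "vector", extra keys as dynamic fields.
        client.create_collection(collection_name=collection, dimension=len(vectors[0]))
    client.insert(collection_name=collection, data=build_rows(chunks, vectors))
```

Storing the essay title next to each chunk is a design choice made here so that a later retrieval step has something to cite. Only `build_rows` runs without network access; the other two functions need an API key and a running Milvus server.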

Step 2: Retrieval and Response Generation

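When a chat message is received, the system retrieves the relevant essay content from the Milvus vector store and generates a response with citations.

This step uses OpenAI embeddings for retrieval and an OpenAI chat model for generation, so that answers stay grounded in the retrieved text and carry citations.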
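
The retrieve-then-cite idea can be illustrated with a small self-contained sketch. Everything in it is a stand-in: `toy_embed` is a bag-of-words counter rather than an OpenAI embedding, the in-memory ranking replaces the Milvus similarity search, and the sample chunks and prompt wording are hypothetical. It shows only the general pattern of numbering the retrieved chunks so that a chat model can cite them.

```python
import math
from collections import Counter


def toy_embed(text):
    """Stand-in for an embedding model: a bag-of-words count vector."""
    words = (w.strip(".,?!").lower() for w in text.split())
    return Counter(w for w in words if w)


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, top_k=2):
    """Rank chunks by similarity to the question (in place of a vector-store search)."""
    query = toy_embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(query, toy_embed(c["text"])), reverse=True)
    return ranked[:top_k]


def build_prompt(question, retrieved):
    """Number the retrieved chunks so the model can cite them as [1], [2], ..."""
    sources = "\n".join(
        f"[{i}] ({chunk['title']}) {chunk['text']}"
        for i, chunk in enumerate(retrieved, start=1)
    )
    return (
        "Answer the question using only the numbered sources below, "
        "and cite them like [1].\n\n"
        f"Sources:\n{sources}\n\nQuestion: {question}"
    )


# Placeholder chunks; real chunks would come from the loaded essays.
SAMPLE_CHUNKS = [
    {"title": "Doc A", "text": "Vector databases store embeddings for similarity search."},
    {"title": "Doc B", "text": "Text splitters break long documents into chunks."},
    {"title": "Doc C", "text": "Chat models generate answers from retrieved context."},
]

question = "How do vector databases store embeddings?"
prompt = build_prompt(question, retrieve(question, SAMPLE_CHUNKS))
```

The resulting `prompt` string would then be sent to a chat model. Because the sources are numbered in the prompt, the bracketed citations in the answer can be mapped back to essay titles.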

For more information on vector databases and similarity search, see the Milvus documentation.

🔗 Nodes Used

HTTP Request, Embeddings OpenAI, OpenAI Chat Model, Recursive Character Text Splitter, Default Data Loader, Chat Trigger
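
As a rough illustration of what a recursive character text splitter does, the sketch below tries coarse separators first (paragraph breaks, then line breaks, then spaces) and falls back to hard cuts only when a piece is still too long. It is a simplified stand-in written for this page, not the node's implementation. It has no chunk overlap, and the function name `recursive_split` and its defaults are assumptions.

```python
def recursive_split(text, chunk_size=200, separators=("\n\n", "\n", " ", "")):
    """Split text into chunks of at most chunk_size characters."""
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators or separators[0] == "":
        # Last resort: cut at fixed positions, ignoring word boundaries.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    sep, finer = separators[0], separators[1:]
    chunks, current = [], ""
    for part in text.split(sep):
        candidate = part if not current else current + sep + part
        if len(candidate) <= chunk_size:
            current = candidate  # keep growing the current chunk
            continue
        if current:
            chunks.append(current)
        if len(part) <= chunk_size:
            current = part
        else:
            # The part alone is too long: retry it with finer separators.
            chunks.extend(recursive_split(part, chunk_size, finer))
            current = ""
    if current:
        chunks.append(current)
    return chunks


sample = "alpha beta gamma\n\ndelta epsilon\nzeta eta theta iota"
pieces = recursive_split(sample, chunk_size=12)
```

Preferring coarse separators tends to keep paragraphs and sentences together, which usually gives the embedding model more coherent text than fixed-width cuts.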

📥 Import

Download workflow.json and import it into n8n: Workflow menu → Import from File

📖 Importing guide · 🔑 Credential setup