💬 Turn your website docs into a GPT-4.1-mini support chatbot with MrScraper and Pinecone

Description

This n8n template turns any website or documentation portal into a fully functional AI-powered support chatbot, with no manual copy-pasting and no static FAQs. It uses MrScraper to crawl and extract your site’s content, OpenAI to generate embeddings, and Pinecone to store and retrieve that knowledge at chat time.

The result is a retrieval-augmented chatbot grounded in your actual website content: it is designed to answer only from the retrieved pages, cite its sources, and avoid inventing policies or pricing.
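
To make the retrieval step concrete, here is a minimal Python sketch of the idea, not the template's actual implementation. ToyVectorStore is a hypothetical in-memory stand-in for Pinecone, the three-number vectors stand in for real OpenAI embeddings, and the example URLs are made up. It illustrates two behaviours the workflow relies on: a query only sees vectors stored under the same namespace, and only the top-k most similar chunks are handed to the chat model.

```python
import math
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class ToyVectorStore:
    """Hypothetical in-memory stand-in for a vector index, split by namespace."""

    def __init__(self):
        self._data = defaultdict(list)  # namespace -> [(vector, metadata)]

    def upsert(self, namespace, vector, metadata):
        self._data[namespace].append((vector, metadata))

    def query(self, namespace, vector, top_k):
        # Score every stored vector in this namespace, best match first.
        scored = [(cosine(vector, v), meta) for v, meta in self._data.get(namespace, [])]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:top_k]


store = ToyVectorStore()
# Hand-made "embeddings"; real ones would come from the embedding model.
store.upsert("docs-example", [1.0, 0.0, 0.0], {"url": "https://docs.example.com/pricing"})
store.upsert("docs-example", [0.0, 1.0, 0.0], {"url": "https://docs.example.com/refunds"})
store.upsert("docs-example", [0.9, 0.1, 0.0], {"url": "https://docs.example.com/plans"})

hits = store.query("docs-example", [1.0, 0.0, 0.0], top_k=2)
sources = [meta["url"] for _, meta in hits]
print(sources)  # the pricing page first, then the plans page

# Querying a namespace that was never indexed returns nothing.
print(store.query("docs-other", [1.0, 0.0, 0.0], top_k=2))  # []
```

The URLs kept in the metadata are what would let the chatbot cite its sources alongside each answer.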


How It Works


How to Set Up

  1. Create 2 scrapers in your MrScraper account:

    • Map Agent Scraper (for crawling and discovering page URLs)
    • General Agent Scraper (for extracting title + content from each page)
    • Copy the scraperId for each; you’ll need these in n8n.
  2. Set up your Pinecone index:

    • Create a Pinecone index with dimensions that match your chosen OpenAI embedding model (e.g. 1536 for text-embedding-ada-002)
    • Choose a namespace (recommended format: docs-yourdomain)
  3. Add your credentials in n8n:

    • MrScraper API token
    • OpenAI API key (used for both embeddings and the chat model)
    • Pinecone API key
  4. Configure the Map Agent node:

    • Set your target domain or docs root URL (e.g. https://docs.yoursite.com)
    • Set includePatterns to focus on relevant sections (e.g. /docs/, /help/, /support/)
    • Optionally set excludePatterns to skip noise (e.g. /assets/, /tag/, /static/)
  5. Configure the General Agent node:

    • Enter your General Agent scraperId
    • Adjust the batch size in the SplitInBatches node (start with 1–5 to stay within rate limits)
  6. Configure the Pinecone nodes:

    • Select your Pinecone index in both the Upsert and Retriever nodes
    • Set the correct namespace in both nodes so indexing and retrieval use the same data
  7. Customise the chatbot system prompt:

    • Edit the Support Chat Agent’s system message to set the chatbot’s name, tone, and rules
    • Adjust topK in the Pinecone Retriever (default: 8) based on how much context you want per answer
  8. Connect your chat widget or frontend to the Chat Trigger webhook URL generated by n8n
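
The index dimension in step 2 is easy to get wrong, so a quick sanity check before indexing can help. The sketch below is illustrative only: check_index_dimension is a hypothetical helper, and the table lists the default output sizes of three OpenAI embedding models (the text-embedding-3 models can be configured to return shorter vectors, which this sketch ignores).

```python
# Default output dimensions of some OpenAI embedding models.
EMBEDDING_DIMENSIONS = {
    "text-embedding-ada-002": 1536,
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
}


def check_index_dimension(model: str, index_dimension: int) -> None:
    """Raise ValueError if the index dimension does not match the model's."""
    expected = EMBEDDING_DIMENSIONS.get(model)
    if expected is None:
        raise ValueError(f"Unknown embedding model: {model!r}")
    if expected != index_dimension:
        raise ValueError(
            f"{model} produces {expected}-dimensional vectors, "
            f"but the index expects {index_dimension}"
        )


# Passes silently: ada-002 vectors fit a 1536-dimensional index.
check_index_dimension("text-embedding-ada-002", 1536)
```

A mismatch, such as text-embedding-3-large against a 1536-dimensional index, raises a ValueError here instead of failing later during upsert.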

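For the includePatterns and excludePatterns in step 4, it can help to preview which URLs your patterns would keep. This sketch assumes plain substring matching on the URL path; MrScraper's own matching rules may differ (for example globs or regular expressions), so treat filter_urls as a hypothetical approximation rather than its actual behaviour.

```python
from urllib.parse import urlparse


def filter_urls(urls, include_patterns, exclude_patterns=()):
    """Keep URLs whose path contains an include pattern and no exclude pattern."""
    kept = []
    for url in urls:
        path = urlparse(url).path
        # An empty include list means "include everything".
        included = not include_patterns or any(p in path for p in include_patterns)
        excluded = any(p in path for p in exclude_patterns)
        if included and not excluded:
            kept.append(url)
    return kept


discovered = [
    "https://docs.yoursite.com/docs/getting-started",
    "https://docs.yoursite.com/docs/assets/logo.png",
    "https://docs.yoursite.com/blog/tag/news",
    "https://docs.yoursite.com/help/billing",
]
kept = filter_urls(discovered, ["/docs/", "/help/"], ["/assets/", "/tag/"])
print(kept)  # the getting-started and billing pages only
```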

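The batch size in step 5 controls how many pages are sent for extraction at once. As a rough illustration of what splitting into batches does (split_in_batches is a hypothetical helper, not the n8n node's implementation):

```python
def split_in_batches(items, batch_size):
    """Yield consecutive slices of `items`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


pages = ["a", "b", "c", "d", "e", "f", "g"]
batches = list(split_in_batches(pages, 3))
print(batches)  # [['a', 'b', 'c'], ['d', 'e', 'f'], ['g']]
```

Smaller batches mean more round trips but fewer simultaneous requests, which is why starting low is the safer choice under rate limits.
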
Requirements


Good to Know


Customising This Workflow

🔗 Nodes Used

AI Agent, Embeddings OpenAI, OpenAI Chat Model, Simple Memory, Pinecone Vector Store, Default Data Loader

📥 Import

Download workflow.json and import into n8n: Workflow menu → Import from File
