📊 AI Website Scraper & Company Intelligence

📊 Market Research & Insights

Description


This workflow turns any website URL into a structured, intelligent company profile.
It is triggered by a form in which a user submits a website and chooses between a “basic” or “deep” scrape.

The workflow extracts key information (mission, services, contacts, SEO keywords), stores it in a structured Supabase database, and archives a full JSON backup to Google Drive.
It also features a secondary AI agent that automatically finds and saves competitors for each company, building a rich, interconnected database of company intelligence.
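
To make the idea of a "structured company profile" concrete, here is a minimal Python sketch of what such a record could look like. The field names are assumptions chosen for illustration (taken loosely from the items listed above); the workflow's actual Supabase columns and JSON keys may differ.

```python
from dataclasses import dataclass, field, asdict
import json


@dataclass
class CompanyProfile:
    # Hypothetical field names; adjust them to match your own Supabase schema.
    website_url: str
    mission: str = ""
    services: list[str] = field(default_factory=list)
    contacts: dict[str, str] = field(default_factory=dict)
    seo_keywords: list[str] = field(default_factory=list)
    competitors: list[str] = field(default_factory=list)

    def to_json(self) -> str:
        """Serialize the profile, e.g. for a JSON backup file."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)


# Example record with placeholder values.
profile = CompanyProfile(
    website_url="https://example.com",
    mission="Example mission statement",
    services=["consulting"],
    contacts={"email": "info@example.com"},
    seo_keywords=["example", "demo"],
)
backup_text = profile.to_json()
```

The same dictionary could serve both as the row sent to the database and as the archived JSON file, which keeps the two copies consistent.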


Quick Implementation Steps

  1. Import the Workflow: Import the provided JSON file into your n8n instance.

  2. Install Custom Community Node:
    You must install the community node from:
    https://www.npmjs.com/package/n8n-nodes-crawl-and-scrape
    Firecrawl n8n documentation: https://docs.firecrawl.dev/developer-guides/workflow-automation/n8n

  3. Install Additional Nodes:
    n8n-nodes-crawl-and-scrape and n8n-nodes-mcp (used for the Firecrawl MCP connection).

  4. Set up Credentials:
    Create credentials in n8n for the Firecrawl API, Supabase, Mistral AI, and Google Drive.

  5. Configure API Key (CRITICAL):

    • Open the Web Search tool node.
    • Go to Parameters → Headers and replace the hardcoded Tavily AI API key with your own.
  6. Configure Supabase Nodes:

    • Assign your Supabase credential to all Supabase nodes.
    • Ensure table names (e.g., companies, competitors) match your schema.
  7. Configure Google Drive Nodes:

    • Assign your Google Drive credential to the “Google Drive2” and “save to Google Drive1” nodes.
    • Select the correct Folder ID.
  8. Activate Workflow:
    Turn on the workflow and open the Webhook URL in the “On form submission” node to access the form.
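
For step 5, the point is that the Tavily key should be your own and ideally not hardcoded. The sketch below shows one way to build request headers from an environment variable outside n8n. The variable name `TAVILY_API_KEY` and the Bearer-style `Authorization` header are assumptions; check the existing header in the Web Search tool node and keep whatever format it already uses.

```python
import os


def tavily_headers(env=os.environ) -> dict:
    """Build HTTP headers from an environment variable instead of a hardcoded key."""
    # TAVILY_API_KEY is an assumed variable name, not something the workflow defines.
    key = env.get("TAVILY_API_KEY", "").strip()
    if not key:
        raise ValueError("TAVILY_API_KEY is not set")
    return {
        # Assumed header format; mirror the one already present in the node.
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }


# Demonstration with a placeholder key passed in explicitly.
demo_headers = tavily_headers({"TAVILY_API_KEY": "tvly-placeholder"})
```

Failing fast on a missing key makes the "Web Search Tool fails" case easier to diagnose than a rejected request later in the run.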


What It Does

Form Trigger

Captures user input: “Website URL” and “Scraping Type” (basic or deep).
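
The description does not say whether the workflow validates these two fields, so the following is only a sketch of checks one might apply to a submission before scraping. The function name and the returned keys are hypothetical.

```python
from urllib.parse import urlparse

ALLOWED_TYPES = {"basic", "deep"}


def validate_submission(website_url: str, scraping_type: str) -> dict:
    """Check a form submission and return it in a normalized shape."""
    parsed = urlparse(website_url.strip())
    # Require an explicit http(s) scheme and a host part.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"Not a valid website URL: {website_url!r}")
    mode = scraping_type.strip().lower()
    if mode not in ALLOWED_TYPES:
        raise ValueError(f"Scraping type must be 'basic' or 'deep', got {scraping_type!r}")
    return {"website_url": parsed.geturl(), "scraping_type": mode}


submission = validate_submission("https://example.com", "Deep")
```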

Scraping Router

A Switch node routes the flow by scraping type: “deep” goes to the Firecrawl AI agent and “basic” goes to Crawlee.

Deep Scraping (Firecrawl AI Agent)

Basic Scraping (Crawlee)

Data Storage

Automated Competitor Analysis


Who’s It For


Requirements


How It Works

Flow Summary

  1. Form Trigger: Captures “Website URL” and “Scraping Type”.
  2. Switch Node:
    • deep → MCP Firecrawler (AI Agent).
    • basic → Crawl and Scrape node.
  3. Scraping & Extraction:
    • Deep path: Firecrawler → JSON structure.
    • Basic path: Crawlee → Mistral extractor → JSON.
  4. Storage:
    • Save JSON to Supabase.
    • Archive in Google Drive.
  5. Competitor Analysis (Deep Only):
    • Finds competitors via Tavily.
    • Saves to Supabase competitors table.
  6. End: Finishes with a No Operation node.
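
The flow summary above can be restated as a small routing function. This is an illustrative model of the branching only, not the workflow's actual implementation; the step labels are copied from the summary and the function name is hypothetical.

```python
def plan_steps(scraping_type: str) -> list[str]:
    """Return the ordered steps the flow summary describes for one submission."""
    if scraping_type == "deep":
        steps = ["MCP Firecrawler (AI Agent)"]
    elif scraping_type == "basic":
        steps = ["Crawl and Scrape node", "Mistral extractor"]
    else:
        raise ValueError(f"Unknown scraping type: {scraping_type!r}")

    # Both paths store the result the same way.
    steps += ["Save JSON to Supabase", "Archive in Google Drive"]

    # Competitor analysis runs on the deep path only.
    if scraping_type == "deep":
        steps += ["Find competitors via Tavily", "Save to Supabase competitors table"]

    steps.append("No Operation")
    return steps


basic_plan = plan_steps("basic")
deep_plan = plan_steps("deep")
```

Comparing the two plans shows the practical difference between the modes: the deep path replaces the Crawlee and Mistral steps with the Firecrawl agent and adds the two competitor steps.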

How To Set Up

  1. Import workflow JSON.
  2. Install community nodes (especially n8n-nodes-crawl-and-scrape from npm).
  3. Configure credentials (Firecrawl API, Supabase, Mistral AI, Google Drive).
  4. Add your Tavily API key.
  5. Connect Supabase and Drive nodes properly.
  6. Fix disconnected “basic” path if needed.
  7. Activate workflow.
  8. Test via the webhook form URL.

How To Customize


Add-ons


Use Case Examples



Troubleshooting Guide

Issue | Possible Cause | Solution
Form Trigger 404 | Workflow not active | Activate the workflow
Web Search Tool fails | Missing Tavily API key | Replace the placeholder key
FIRECRAWLER / find competitor fails | Missing MCP node | Install n8n-nodes-mcp
Basic scrape does nothing | Switch node path disconnected | Reconnect “basic” output
Supabase node error | Wrong table/column names | Match schema exactly
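
The "Basic scrape does nothing" case can also be checked offline by inspecting the exported workflow file. The sketch below assumes the usual n8n export layout, where `nodes` is a list of objects with `name` and `type`, and `connections[name]["main"]` holds one list of targets per output. The node names in the sample are hypothetical, and which output index corresponds to “basic” depends on how the Switch node is configured.

```python
def find_unconnected_outputs(workflow: dict, type_suffix: str = ".switch") -> list[str]:
    """Report outputs of matching nodes that have no downstream target."""
    problems = []
    connections = workflow.get("connections", {})
    for node in workflow.get("nodes", []):
        if not node.get("type", "").endswith(type_suffix):
            continue
        name = node["name"]
        outputs = connections.get(name, {}).get("main", [])
        if not outputs:
            problems.append(f"{name}: no outgoing connections")
            continue
        # Note: outputs omitted from the end of the list cannot be detected here.
        for index, targets in enumerate(outputs):
            if not targets:
                problems.append(f"{name}: output {index} has no target")
    return problems


# Small stand-in for an exported workflow; the node names are hypothetical.
sample_workflow = {
    "nodes": [
        {"name": "Switch", "type": "n8n-nodes-base.switch"},
        {"name": "AI Agent", "type": "@n8n/n8n-nodes-langchain.agent"},
    ],
    "connections": {
        "Switch": {
            "main": [
                [{"node": "AI Agent", "type": "main", "index": 0}],
                [],  # second output left unconnected
            ]
        }
    },
}

report = find_unconnected_outputs(sample_workflow)
```

To run this against a real export, load the file with `json.load` and pass the resulting dictionary in place of `sample_workflow`.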

Need Help or More Workflows?

Want to customize this workflow for your business or integrate it with your existing tools?
Our team at Digital Biz Tech can tailor it precisely to your use case, from automation logic to AI-powered enhancements.

Contact: rajeet.nair@digitalbiz.tech
For more such offerings, visit us: https://www.digitalbiz.tech


🔗 Nodes Used

Google Drive, Supabase, AI Agent, Structured Output Parser, n8n Form Trigger, Convert to File

📥 Import

Download workflow.json and import into n8n: Workflow menu → Import from File
