📊 Advanced multi-source AI research with Bright Data, OpenAI, Redis

📊 Market Research & Insights

Description

How it Works

This workflow transforms natural language queries into research reports through a five-stage AI pipeline. When triggered via webhook (typically from Google Sheets, using the companion google-apps-script.js GitHub gist), it first checks the Redis cache and returns a stored result immediately if one exists.
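The cache check described above follows the common cache-aside pattern. Here is a minimal Python sketch of that idea, not the workflow's actual node logic: the key format, the `InMemoryCache` stand-in (used so the example runs without a Redis server), and the `run_pipeline` callback are all assumptions made for illustration.

```python
import hashlib
import json
import time


def cache_key(prompt: str, language: str = "English") -> str:
    """Build a stable key; case and extra whitespace do not change it."""
    normalized = " ".join(prompt.lower().split())
    raw = f"{language.lower()}|{normalized}".encode("utf-8")
    return "research:" + hashlib.sha256(raw).hexdigest()  # hypothetical key format


class InMemoryCache:
    """Stand-in for Redis GET / SET-with-expiry (assumption: illustration only)."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # entry expired, treat as a miss
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._store[key] = (value, time.monotonic() + ttl_seconds)


def research(prompt, cache, run_pipeline, ttl_seconds=3600):
    """Return (result, cache_hit). The pipeline only runs on a cache miss."""
    key = cache_key(prompt)
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached), True
    result = run_pipeline(prompt)
    cache.set(key, json.dumps(result), ttl_seconds)
    return result, False
```

Normalizing the prompt before hashing is a design choice in this sketch: it lets "What is the population of Tokyo?" and "what is  the population of tokyo?" share one cache entry.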

For new queries, GPT-4o breaks complex questions into focused sub-queries, optimizes them for search, then uses Bright Data’s MCP Tool to find the top 5 credible sources (official sites, news, financial reports). URLs are scraped in parallel, bypassing bot detection.
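To illustrate the parallel scraping step, here is a small Python sketch that fans out a list of URLs to worker threads and collects each outcome separately. It assumes a caller-supplied `fetch` function (hypothetical; in the workflow this role is played by the Bright Data request node) and does not call any Bright Data API itself.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def scrape_all(urls, fetch, max_workers=5):
    """Fetch every URL concurrently; one failing URL does not abort the rest."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to its URL so results can be labeled.
        futures = {pool.submit(fetch, url): url for url in urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                results[url] = {"ok": True, "content": future.result()}
            except Exception as exc:  # record the failure and keep going
                results[url] = {"ok": False, "error": str(exc)}
    return results
```

Recording failures per URL, rather than raising, lets later stages continue with whichever sources were retrieved successfully.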

GPT-4o extracts structured data from each source: answers, facts, entities, sentiment, quotes, and dates. GPT-4o-mini then validates source credibility and filters out unreliable content. Valid results are aggregated into a final summary with confidence scores, key insights, and extended analysis.
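The filter-then-aggregate step can be pictured with a short Python sketch. The field names (`confidence`, `credibility`), the 0.6 threshold, and the credibility-weighted average are all assumptions chosen for illustration; the description does not state how the workflow actually computes its confidence scores.

```python
from dataclasses import dataclass


@dataclass
class SourceResult:
    url: str
    answer: str
    confidence: float   # extractor's confidence in the answer, 0..1 (hypothetical)
    credibility: float  # validator's score for the source, 0..1 (hypothetical)


def aggregate(results, min_credibility=0.6):
    """Drop low-credibility sources, then average confidence weighted by credibility."""
    valid = [r for r in results if r.credibility >= min_credibility]
    if not valid:
        return {"sources_used": 0, "confidence": 0.0, "answers": []}
    total_weight = sum(r.credibility for r in valid)
    weighted = sum(r.confidence * r.credibility for r in valid)
    return {
        "sources_used": len(valid),
        "confidence": round(weighted / total_weight, 3),
        "answers": [r.answer for r in valid],
    }
```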

Results are cached for 1 hour and delivered via webhook, Slack, email, and DataTable. A full run takes 30-90 seconds, and incoming requests are rate-limited to 60 per minute.
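One way to enforce a limit of 60 requests per minute is a sliding-window counter. The Python sketch below shows that technique in isolation; it is an assumption about how such a limit could work, not a description of how the workflow implements it (which may well use Redis counters instead). The `clock` parameter is a hypothetical hook that makes the limiter easy to test.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most max_requests calls within any window_seconds span."""

    def __init__(self, max_requests=60, window_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock
        self._timestamps = deque()  # times of accepted requests, oldest first

    def allow(self):
        now = self._clock()
        # Forget requests that have aged out of the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) >= self.max_requests:
            return False
        self._timestamps.append(now)
        return True
```

A caller would check `limiter.allow()` before starting the pipeline and reject or queue the request when it returns `False`.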


Who is this for?


Setup Steps

Setup time: 30-45 minutes

Requirements:

Core Setup:

  1. Get Bright Data Web Scraping API token and MCP token
  2. Get OpenAI API key
  3. Set up Redis instance
  4. Configure critical nodes:
    • Webhook Entry: Add Header Auth token
    • Bright Data MCP Tool: Add MCP endpoint with token
    • Parallel Web Scraping: Add Bright Data API credentials
    • Redis Nodes: Add connection credentials
    • All GPT Nodes: Add OpenAI API key (5 nodes)
    • Slack/Email: Add credentials if using

Google Sheets Integration:

  1. Create Google Sheet
  2. Open Extensions → Apps Script
  3. Paste the companion google-apps-script.js code
  4. Update webhook URL and auth token
  5. Save and authorize

Test: {"prompt": "What is the population of Tokyo?", "source": "Test", "language": "English"}
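The test payload above can be sent from any HTTP client. Here is a minimal Python sketch using only the standard library. The webhook URL, the header name `X-Auth-Token`, and the token value are placeholders (assumptions): substitute the URL shown on your Webhook Entry node and the header name and value from your Header Auth credential.

```python
import json
import urllib.request

WEBHOOK_URL = "https://your-n8n-host.example/webhook/research"  # placeholder URL
AUTH_HEADER = "X-Auth-Token"  # hypothetical; use your Header Auth credential's name
AUTH_TOKEN = "replace-me"     # placeholder token


def build_request(payload):
    """Build a JSON POST request carrying the auth header."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        WEBHOOK_URL,
        data=body,
        headers={"Content-Type": "application/json", AUTH_HEADER: AUTH_TOKEN},
        method="POST",
    )


payload = {
    "prompt": "What is the population of Tokyo?",
    "source": "Test",
    "language": "English",
}
request = build_request(payload)

# Uncomment to send once the placeholders are filled in. The timeout is generous
# because an uncached run may take 30-90 seconds.
# with urllib.request.urlopen(request, timeout=120) as response:
#     print(response.status, response.read().decode("utf-8"))
```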


Customization Guidance


Once configured, this workflow handles web research tasks ranging from fact-checking to complex analysis, delivering validated results in 30-90 seconds for new queries and immediately for cached ones.


Built by Daniel Shashko

🔗 Nodes Used

Send Email, HTTP Request, Redis, Slack, Webhook, AI Agent

📥 Import

Download workflow.json and import into n8n: Workflow menu → Import from File
