📊 Web research assistant: automated search & scraping with Gemini AI and spreadsheet reports



Description

⚠️ IMPORTANT: This template requires a self-hosted n8n instance because it uses community nodes (MCP tools). It will not work on n8n Cloud. Make sure you have access to a self-hosted n8n instance before using this template.

Overview


This workflow lets a Google Gemini-powered AI Agent orchestrate multi-source web research using MCP (Model Context Protocol) tools such as Firecrawl, Brave Search, and Apify.

Users interact with the agent in natural language. The agent then calls the external data collection tools, processes the results, and automatically organizes them into structured spreadsheets.

With built-in memory, flexible tool execution, and conversational capabilities, this workflow acts as a multi-tool research assistant that retrieves, synthesizes, and delivers actionable insights in real time.

How the system works

AI Agent + MCP Pipeline

  1. User interaction: A chat message is received and forwarded to the AI Agent.

  2. AI orchestration: The agent, powered by Google Gemini, decides which MCP tools to invoke based on the query.

    • Firecrawl-MCP: Recursive web crawling and content extraction.
    • Brave-MCP: Real-time web search with structured results.
    • Apify-MCP: Automation of web scraping tasks with scalable execution.

  3. Memory management: A memory module stores context across conversations, ensuring multi-turn reasoning and task continuity.

  4. Spreadsheet automation: Results are structured in a new, automatically created Google Spreadsheet, enriched with formatting and additional metadata.

  5. Data processing: The workflow generates the spreadsheet content, updates the sheet, and improves results via HTTP requests and field edits.

  6. Delivery of results: Users receive a structured and contextualized dataset ready for review, analysis, or integration into other systems.
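
The memory step can be pictured as a per-session sliding window over recent exchanges. The following is a minimal Python sketch of that idea, not the actual implementation of the n8n Simple Memory node; the class name `WindowMemory`, the window size, and the session key `chat-1` are illustrative assumptions.

```python
from collections import defaultdict, deque


class WindowMemory:
    """Keeps only the most recent `window` exchanges for each chat session."""

    def __init__(self, window: int = 5):  # window size is an assumption
        # One bounded deque per session; the oldest entry drops off when full.
        self._turns = defaultdict(lambda: deque(maxlen=window))

    def add(self, session_id: str, user: str, agent: str) -> None:
        """Record one user/agent exchange for the given session."""
        self._turns[session_id].append({"user": user, "agent": agent})

    def context(self, session_id: str) -> list[dict]:
        """Return the retained exchanges, oldest first."""
        return list(self._turns[session_id])


memory = WindowMemory(window=2)
memory.add("chat-1", "Find pricing pages for CRM tools", "Searching with Brave...")
memory.add("chat-1", "Only keep vendors with a free tier", "Filtering results...")
memory.add("chat-1", "Put them in a sheet", "Creating spreadsheet...")
print(memory.context("chat-1"))  # only the last two exchanges remain
```

A bounded window keeps the prompt sent to the model small while still letting follow-up messages refer to the previous request.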

Configuration instructions

Estimated setup time: 45 minutes

Prerequisites

Detailed configuration steps

Step 1: Configuring the AI Agent

Step 2: Integrating MCP Tools

Step 3: Spreadsheet automation

Step 4: Post-processing and delivery

Structure of generated Google Sheets

Default columns

Column      | Description                            | Type
------------|----------------------------------------|----------
URL         | Data source URL                        | Hyperlink
Title       | Page/resource title                    | Text
Description | Description or content excerpt         | Long text
Source      | MCP tool used (Brave/Firecrawl/Apify)  | Text
Timestamp   | Date/time of collection                | Date/Time
Metadata    | Additional data (JSON)                 | Text
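
To make the column layout concrete, here is a hedged Python sketch that maps one tool result onto the default columns. The input field names (`url`, `title`, `description`), the helper `to_row`, and the ISO 8601 timestamp format are assumptions for illustration; the workflow itself builds rows with n8n nodes, not with this code.

```python
import json
from datetime import datetime, timezone

# Column order taken from the table above.
COLUMNS = ["URL", "Title", "Description", "Source", "Timestamp", "Metadata"]


def to_row(result: dict, source: str) -> list[str]:
    """Map one tool result (hypothetical shape) onto the default column order."""
    known = {"url", "title", "description"}
    # Any field without its own column is preserved as JSON in Metadata.
    extra = {k: v for k, v in result.items() if k not in known}
    return [
        result.get("url", ""),
        result.get("title", ""),
        result.get("description", ""),
        source,
        datetime.now(timezone.utc).isoformat(timespec="seconds"),
        json.dumps(extra, sort_keys=True),
    ]


row = to_row(
    {"url": "https://example.com", "title": "Example", "description": "Demo page", "rank": 1},
    source="Brave",
)
print(dict(zip(COLUMNS, row)))
```

Keeping unmapped fields in a single JSON column means the sheet layout stays stable even when different tools return different extra fields.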

Automatic formatting

Use cases

Business and enterprise

Research and academia

Engineering and development

Personal productivity

Key features

Multi-source intelligence

AI-driven orchestration

Structured data output

Performance and scalability

Security and privacy

Technical architecture

Workflow

User query → AI agent (Gemini) → MCP tools (Firecrawl / Brave / Apify) → Aggregated results → Spreadsheet creation → Data processing → Results delivery
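
The "Aggregated results" stage can be sketched as merging the per-tool result lists into one list. The Python below is only an illustration under stated assumptions: the function `aggregate`, the `url` and `source` field names, and the rule of dropping duplicate URLs (ignoring a trailing slash) are not taken from the workflow, which may combine results differently.

```python
def aggregate(results_by_tool: dict[str, list[dict]]) -> list[dict]:
    """Merge per-tool results, keeping the first hit for each URL."""
    seen = set()
    merged = []
    for tool, results in results_by_tool.items():
        for item in results:
            # Treat "https://example.com/" and "https://example.com" as the same page.
            key = item["url"].rstrip("/")
            if key in seen:
                continue
            seen.add(key)
            # Record which tool produced the row (the Source column).
            merged.append({**item, "source": tool})
    return merged


results = {
    "Brave": [{"url": "https://example.com/", "title": "Example"}],
    "Firecrawl": [
        {"url": "https://example.com", "title": "Example Domain"},
        {"url": "https://example.org/docs", "title": "Docs"},
    ],
}
merged = aggregate(results)
print([(m["source"], m["url"]) for m in merged])
```

In this sample the Firecrawl copy of `https://example.com` is skipped because Brave already returned the same page, so two rows reach the spreadsheet stage.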

Supported data types

Integration options

Chat interfaces

Data sources

Performance specifications

Advanced configuration options

Customization

Analytics and monitoring

Troubleshooting and support

🔗 Nodes Used

Google Sheets, HTTP Request, AI Agent, Simple Memory, Chat Trigger, Google Gemini Chat Model

📥 Import

Download workflow.json and import into n8n: Workflow menu → Import from File
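
Before importing, it can help to check which node types in the file are not built in, since community nodes are the reason this template needs a self-hosted instance. The Python sketch below assumes the exported file has a top-level `nodes` list whose entries carry a `type` string; the prefix heuristic, the function name `community_node_types`, and the sample node entries (including the MCP type name) are illustrative assumptions rather than values read from this template.

```python
import json

# Prefixes of n8n's built-in node packages; any other type is treated as a
# community node. This heuristic is an assumption, not an official check.
BUILT_IN_PREFIXES = ("n8n-nodes-base.", "@n8n/n8n-nodes-langchain.")


def community_node_types(workflow: dict) -> set[str]:
    """Return node types in the workflow that do not match a built-in prefix."""
    types = {node.get("type", "") for node in workflow.get("nodes", [])}
    return {t for t in types if t and not t.startswith(BUILT_IN_PREFIXES)}


# Hypothetical sample standing in for the downloaded workflow.json.
sample = {
    "nodes": [
        {"name": "Chat Trigger", "type": "@n8n/n8n-nodes-langchain.chatTrigger"},
        {"name": "Google Sheets", "type": "n8n-nodes-base.googleSheets"},
        {"name": "Firecrawl MCP", "type": "n8n-nodes-mcp.mcpClientTool"},
    ]
}
print(community_node_types(sample))

# With the real file:
#   with open("workflow.json") as f:
#       print(community_node_types(json.load(f)))
```

Any type the function reports would need its community package installed on the self-hosted instance before the imported workflow can run.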
