Vision-based AI agent scraper - with Google Sheets, ScrapingBee, and Gemini
37,286 views · Market Research & Insights
Pro Tip: HTTP Request scraping tends to break when sites update their markup. If you're scraping a major platform, check whether ScraperNode covers it; it maintains scrapers that return structured data for LinkedIn, Instagram, TikTok, YouTube, and 20+ other platforms.
Description
Important Notes:
Check Legal Regulations:
This workflow involves scraping, so ensure you comply with the legal regulations in your country before getting started. Better safe than sorry!
Workflow Description:
Tired of struggling with XPath, CSS selectors, or DOM specificity when scraping?
This AI-powered solution is here to simplify your workflow! With a vision-based AI Agent, you can extract data effortlessly without worrying about how the DOM is structured.
This workflow leverages a vision-based AI Agent, integrated with Google Sheets, ScrapingBee, and the Gemini-1.5-Pro model, to extract structured data from webpages. The AI Agent primarily uses screenshots for data extraction but switches to HTML scraping when necessary, ensuring high accuracy.
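The screenshot-first, HTML-second behavior described above can be sketched as plain control flow. This is a simplified illustration, not the workflow's actual implementation: in n8n the AI Agent decides when to call its HTML tool, and the names `extract_with_fallback`, `from_screenshot`, and `from_html` are hypothetical.

```python
from typing import Callable, Optional

# Hypothetical type: an extractor takes a URL and returns parsed fields,
# or None when it could not read the page reliably.
Extractor = Callable[[str], Optional[dict]]


def extract_with_fallback(url: str, from_screenshot: Extractor, from_html: Extractor) -> dict:
    """Try the vision path first; use the HTML path only if it yields nothing."""
    result = from_screenshot(url)
    source = "screenshot"
    if not result:
        # Vision extraction came back empty, so fall back to the HTML path.
        result = from_html(url)
        source = "html"
    if not result:
        raise ValueError(f"no data extracted from {url}")
    return {"source": source, "data": result}


# Usage with stub extractors: the vision stub fails, so the HTML stub is used.
print(extract_with_fallback(
    "https://example.com/products",
    from_screenshot=lambda url: None,
    from_html=lambda url: {"product_title": "Mug"},
))
```

Recording which path produced the data makes it easier to spot pages where the screenshot route fails regularly.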
Key Features:
- Google Sheets Integration: Manage URLs to scrape and store structured results.
- ScrapingBee: Capture full-page screenshots and retrieve HTML data for fallback extraction.
- AI-Powered Data Parsing: Use Gemini-1.5-Pro for vision-based scraping and a Structured Output Parser to format extracted data into JSON.
- Token Efficiency: HTML is converted to Markdown to optimize processing costs.
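To see why converting HTML to Markdown saves tokens, here is a minimal sketch using only Python's standard library. It is a stand-in for illustration, not the n8n Markdown node's implementation; the class `MarkdownSketch` and the function `html_to_markdown` are hypothetical names, and only headings, paragraphs, list items, and links are handled.

```python
import re
from html.parser import HTMLParser


class MarkdownSketch(HTMLParser):
    """Tiny HTML-to-Markdown converter: headings, paragraphs, list items, links."""

    SKIP = {"script", "style", "noscript"}  # content that carries no page text

    def __init__(self):
        super().__init__()
        self.parts = []
        self.skip_depth = 0
        self.href = None

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
        elif tag in {"h1", "h2", "h3", "h4", "h5", "h6"}:
            self.parts.append("\n" + "#" * int(tag[1]) + " ")
        elif tag == "li":
            self.parts.append("\n- ")
        elif tag == "p":
            self.parts.append("\n")
        elif tag == "a":
            self.href = dict(attrs).get("href")
            self.parts.append("[")

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth:
            self.skip_depth -= 1
        elif tag == "a":
            self.parts.append(f"]({self.href or ''})")
            self.href = None

    def handle_data(self, data):
        if not self.skip_depth:
            # Collapse whitespace runs but keep word boundaries.
            self.parts.append(re.sub(r"\s+", " ", data))


def html_to_markdown(html: str) -> str:
    parser = MarkdownSketch()
    parser.feed(html)
    parser.close()
    lines = [re.sub(r" {2,}", " ", line).strip() for line in "".join(parser.parts).split("\n")]
    return "\n".join(line for line in lines if line)


page = ("<html><head><style>p{color:red}</style></head><body><h1>Shoes</h1>"
        "<p>Price: <b>$20</b></p><ul><li><a href='/a'>Item A</a></li></ul>"
        "<script>var x=1;</script></body></html>")
markdown = html_to_markdown(page)
print(markdown)
print(len(page), "characters of HTML ->", len(markdown), "characters of Markdown")
```

Dropping tags, styles, and scripts leaves mostly the visible text, which is the part the model needs for extraction.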
This template is designed for e-commerce scraping but can be customized for various use cases.
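Adapting the template to another use case mostly means changing the fields the Structured Output Parser expects. The sketch below shows one way to check parsed JSON against an expected shape outside n8n. The field names (`product_title`, `product_price`, `in_stock`) are hypothetical examples, not the template's actual schema, and `validate_items` is an illustrative helper.

```python
import json

# Hypothetical e-commerce fields; the template's real schema may differ.
REQUIRED_FIELDS = {
    "product_title": str,
    "product_price": (int, float),
    "in_stock": bool,
}


def validate_items(raw_json: str) -> list:
    """Parse the model's JSON output and keep only items matching the expected shape."""
    items = json.loads(raw_json)
    if not isinstance(items, list):
        raise ValueError("expected a JSON array of items")
    valid = []
    for item in items:
        if isinstance(item, dict) and all(
            isinstance(item.get(name), kind) for name, kind in REQUIRED_FIELDS.items()
        ):
            valid.append(item)
    return valid


raw = ('[{"product_title": "Mug", "product_price": 9.5, "in_stock": true},'
       ' {"product_title": "Cup"}]')
print(validate_items(raw))  # the second item is dropped: it lacks required fields
```

For a different domain, such as job listings or real-estate pages, only `REQUIRED_FIELDS` would need to change in this sketch.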
Nodes Used
Google Sheets, HTTP Request, Markdown, Execute Workflow Trigger, AI Agent, Structured Output Parser
Import
Download workflow.json and import it into n8n:
Workflow menu → Import from File
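Before importing, it can help to check which node types the file contains. The following is a small sketch that assumes the exported file is a JSON object with a top-level `nodes` list whose entries carry a `type` field; `summarize_workflow` is a hypothetical helper, and the sample node types are illustrative rather than read from this template's file.

```python
import json
from collections import Counter


def summarize_workflow(workflow: dict) -> Counter:
    """Count node types in an exported n8n workflow (expects a top-level "nodes" list)."""
    return Counter(node.get("type", "unknown") for node in workflow.get("nodes", []))


# Illustrative sample; for the real file, load it instead:
#     with open("workflow.json", encoding="utf-8") as fh:
#         workflow = json.load(fh)
workflow = json.loads("""
{"nodes": [
  {"name": "Sheet A", "type": "n8n-nodes-base.googleSheets"},
  {"name": "Sheet B", "type": "n8n-nodes-base.googleSheets"},
  {"name": "Fetch page", "type": "n8n-nodes-base.httpRequest"}
]}
""")

for node_type, count in summarize_workflow(workflow).most_common():
    print(f"{count:>3}  {node_type}")
```

If a listed node type is not installed in your n8n instance, the import may show it as an unknown node.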