🔬 Scrape and analyze websites with custom prompts using Gemini, Apify, and LangChain

Description

🔍 AI-Powered Website Prompt Executor (Apify + OpenRouter)

This workflow combines Apify and OpenRouter to scrape website content and run any custom prompt against it with AI. You define what you want, whether that is extracting contact details, summarizing content, collecting job offers, or anything else, and the workflow processes each scraped page and returns the results.

🚀 Overview

This workflow allows you to:

  1. Input a URL and define a prompt.
  2. Scrape the specified number of pages from the website.
  3. Process each page's metadata and Markdown content.
  4. Use AI to interpret and respond to the prompt on each page.
  5. Aggregate and return structured output.
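
Steps 3 and 4 of the list above could be sketched as follows: one scraped page's metadata and Markdown are folded into a single prompt for the AI. This is only an illustration. The page shape (`url`, `metadata.title`, `markdown`), the function name `build_page_prompt`, and the `max_chars` limit are assumptions, not the actor's documented output format.

```python
from typing import Any, Dict


def build_page_prompt(user_prompt: str, page: Dict[str, Any], max_chars: int = 8000) -> str:
    """Combine the user's prompt with one scraped page.

    The keys read from `page` are hypothetical; adjust them to whatever
    the scraper actually returns.
    """
    metadata = page.get("metadata") or {}
    title = metadata.get("title", "")
    url = page.get("url", "")
    # Truncate long pages so the prompt stays within a model's context window.
    markdown = (page.get("markdown") or "")[:max_chars]
    return (
        f"Task: {user_prompt}\n"
        f"Page URL: {url}\n"
        f"Page title: {title}\n"
        "Page content (Markdown):\n"
        f"{markdown}"
    )
```

Truncating the Markdown is a design choice made for this sketch; a real workflow might instead split long pages into chunks.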

🧠 How It Works

Input Example

{
  "enqueue": true,
  "maxPages": 5,
  "url": "https://apify.com",
  "method": "GET",
  "prompt": "collect all contact informations available on this website"
}
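
Before handing such a payload to the scraper, it may help to validate it. The sketch below parses the five fields shown in the example. The class and function names, the default values, and the validation rules are assumptions made for illustration and are not part of the workflow itself.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class WorkflowInput:
    url: str
    prompt: str
    max_pages: int = 5      # assumed default, mirrors the example value
    enqueue: bool = True    # assumed default, mirrors the example value
    method: str = "GET"     # assumed default, mirrors the example value


def parse_input(payload: dict) -> WorkflowInput:
    """Validate the JSON input and map it onto a typed object."""
    url = str(payload.get("url", "")).strip()
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"url must be an absolute http(s) URL, got {url!r}")

    prompt = str(payload.get("prompt", "")).strip()
    if not prompt:
        raise ValueError("prompt must be a non-empty string")

    max_pages = int(payload.get("maxPages", 5))
    if max_pages < 1:
        raise ValueError("maxPages must be at least 1")

    return WorkflowInput(
        url=url,
        prompt=prompt,
        max_pages=max_pages,
        enqueue=bool(payload.get("enqueue", True)),
        method=str(payload.get("method", "GET")).upper(),
    )
```

Calling `parse_input` on the example above yields a `WorkflowInput` with `max_pages=5`, while a payload whose `url` lacks a scheme raises `ValueError`.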

Workflow Steps

Step | Action
1 | Triggered by another workflow with JSON input.
2 | Calls the Apify actor firescraper-ai-website-content-markdown-scraper to scrape content.
3 | Loops through the scraped pages.
4 | AI analyzes each page based on the input prompt.
5 | Aggregates AI outputs across all pages.
6 | Final AI processing step to return a clean structured result.
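
The control flow of these steps can be sketched outside n8n as a plain function. The scraper and the model are passed in as callables so that no real API is assumed: `scrape` and `ask_ai` are hypothetical stand-ins for the Apify actor call and the OpenRouter-backed AI Agent, and the prompt wording is invented for the example.

```python
from typing import Callable, Dict, List

Page = Dict[str, str]


def run_pipeline(
    url: str,
    prompt: str,
    max_pages: int,
    scrape: Callable[[str, int], List[Page]],
    ask_ai: Callable[[str], str],
) -> str:
    """Scrape pages, query the AI once per page, then merge the answers."""
    # Step 2: scrape, keeping at most max_pages results.
    pages = scrape(url, max_pages)[:max_pages]

    # Steps 3-4: loop over the pages and ask the AI about each one.
    per_page: List[str] = []
    for page in pages:
        page_url = page.get("url", "")
        answer = ask_ai(f"{prompt}\n\nPage: {page_url}\n\n{page.get('markdown', '')}")
        per_page.append(f"[{page_url}] {answer}")

    # Step 5: aggregate the per-page outputs.
    combined = "\n".join(per_page)

    # Step 6: one final AI pass to produce a clean, structured result.
    return ask_ai(
        f"Merge these per-page answers into one structured result for the task '{prompt}':\n"
        f"{combined}"
    )
```

With `max_pages` set to 3, this sketch makes four AI calls in total: one per page plus the final merge.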

🛠 Technologies Used

n8n (workflow automation), Apify (website scraping), and OpenRouter (AI model access).


🔧 Customization

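Customize the workflow via the following input fields, as shown in the input example:

  - url: the website to scrape.
  - maxPages: the number of pages to scrape.
  - prompt: the instruction the AI applies to each page.
  - enqueue and method: additional input options, set to true and "GET" in the example.

Because every field is supplied at run time, the same workflow can serve different sites and prompts without modification.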


📦 Output

The workflow returns a JSON result that aggregates the AI's answers across all scraped pages into one structured output.


🧪 Example Use Cases

Typical prompts include extracting contact details, summarizing content, and collecting job offers.


🔐 API Credentials Required

You will need credentials for the two services this workflow calls: an Apify API token and an OpenRouter API key.

Set these credentials in your environment or n8n credential manager before running.
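
If you keep the credentials in the environment, a small startup check can fail fast when one is missing. In the sketch below the variable names `APIFY_API_TOKEN` and `OPENROUTER_API_KEY` are assumptions chosen for the example, so use whatever names your setup defines. Both the Apify API and the OpenRouter API accept a bearer token in the `Authorization` header.

```python
import os
from typing import Dict, Mapping, Optional

# Hypothetical variable names; the workflow does not prescribe them.
REQUIRED_VARS = ("APIFY_API_TOKEN", "OPENROUTER_API_KEY")


def load_credentials(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Read the required credentials, raising if any is unset or empty."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing credentials: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}


def bearer_header(token: str) -> Dict[str, str]:
    """Build the Authorization header used for token-based API calls."""
    return {"Authorization": f"Bearer {token}"}
```

Inside n8n itself, the credential manager handles this for you and no such code is needed.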

🔗 Nodes Used

HTTP Request, Execute Workflow Trigger, AI Agent, OpenRouter Chat Model

📥 Import

Download workflow.json and import it into n8n: Workflow menu → Import from File
