📊 Website content scraper & SEO keyword extractor with GPT-5-mini and Airtable


💡 Pro Tip: HTTP Request scraping tends to break when sites update their markup. If you're scraping a major platform, check whether ScraperNode covers it; it maintains scrapers for LinkedIn, Instagram, TikTok, YouTube, and 20+ other platforms that return structured data.


Description

This workflow scrapes a website's content, cleans the HTML, extracts structured information with GPT-5-mini, and stores the results together with SEO keywords in Airtable. It is suited to building keyword lists and organizing web content for SEO research.


Setup Instructions

1. Prerequisites


2. Airtable Structure

Ensure your Airtable table has the following fields:

| Field Name | Type | Notes |
| --- | --- | --- |
| Website Name | String | Name or URL of the website |
| Data | String | Cleaned website text |
| Keyword | String | Extracted SEO keyword list |
| Status | Options | Values: Todo, In progress, Done |
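
As a rough illustration of how a record matching these fields could look, here is a minimal Python sketch that builds an upsert request body in the shape the Airtable Web API accepts (`performUpsert` with `fieldsToMergeOn`). Within the workflow the Airtable node handles this for you. Merging on `Website Name` and storing the keywords as one comma-separated string are assumptions, and `build_upsert_payload` is a hypothetical helper name.

```python
ALLOWED_STATUS = {"Todo", "In progress", "Done"}


def build_upsert_payload(website_name, data, keywords, status="Todo"):
    """Build an Airtable upsert body for one scraped website.

    Assumption: records are matched on the "Website Name" field, and the
    keyword list is stored as a single comma-separated string.
    """
    if status not in ALLOWED_STATUS:
        raise ValueError(f"Unknown status: {status!r}")
    fields = {
        "Website Name": website_name,
        "Data": data,
        "Keyword": ", ".join(keywords),
        "Status": status,
    }
    return {
        # Update the row if one with the same Website Name exists, else create it.
        "performUpsert": {"fieldsToMergeOn": ["Website Name"]},
        "records": [{"fields": fields}],
    }


payload = build_upsert_payload(
    "example.com", "Cleaned page text", ["seo tools", "keyword research"], "Done"
)
```

Rejecting unknown status values before the request keeps typos out of the Status options field.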

3. Node Setup

✅ Form Trigger: Collects website URL from the user.

✅ HTTP Request: Fetches the website content.

✅ HTML Cleaner (Code Node): Strips out styles, tags, and whitespace to get clean text.

✅ Topic Extractor (AI Agent + GPT-5-mini): Extracts topic-wise information from the cleaned website content.

✅ Text Cleaner (Code Node): Removes unwanted symbols like ### and **.

✅ Keyword Extractor (AI Agent + GPT-5-mini): Generates a list of 90 important SEO keywords.

✅ Airtable Upsert: Stores the cleaned data, keywords, and status in Airtable.
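
The two cleaning steps in the list above can be sketched in a few lines. This is a standalone Python illustration using only the standard library, not the code shipped in the workflow's Code nodes; the function names `html_to_text` and `strip_markdown_symbols` are hypothetical.

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text and skips script/style/noscript content."""

    SKIP = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.parts.append(data)


def html_to_text(html):
    """HTML Cleaner step: drop tags and styles, collapse whitespace."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    # Text fragments are joined with spaces, so inline tags act as word breaks.
    return re.sub(r"\s+", " ", " ".join(parser.parts)).strip()


def strip_markdown_symbols(text):
    """Text Cleaner step: remove heading markers (###) and bold markers (**)."""
    text = re.sub(r"^#{1,6}\s*", "", text, flags=re.MULTILINE)
    text = text.replace("**", "")
    return re.sub(r"[ \t]+", " ", text).strip()


page = "<html><head><style>p{color:red}</style></head><body><h1>Hi</h1><p>Hello   <b>world</b></p></body></html>"
clean = html_to_text(page)
```

A parser-based cleaner is usually more robust than regex tag stripping, since it does not leak inline CSS or JavaScript into the text passed to the model.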


4. Key Features

✅ Automatic website content scraping

✅ Clean HTML and extract plain text

✅ Use GPT-5-mini for topic-wise information extraction

✅ Generate 90-keyword SEO lists

✅ Store and manage data in Airtable
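
Model output rarely arrives as a perfectly clean list, so a small normalization pass before storage can help. The sketch below assumes the keyword extractor returns comma- or newline-separated keywords, possibly numbered or bulleted; that output format and the helper name `parse_keywords` are assumptions, not something the workflow guarantees.

```python
import re


def parse_keywords(raw, limit=90):
    """Split raw model output into a de-duplicated keyword list.

    Assumption: keywords are separated by commas or newlines and may
    carry list prefixes such as "1.", "2)", "-", "*" or "•".
    """
    seen = set()
    keywords = []
    for piece in re.split(r"[,\n]+", raw):
        # Drop a leading list marker, then surrounding whitespace.
        kw = re.sub(r"^\s*(?:\d+[.)]|[-*•])\s*", "", piece).strip()
        key = kw.lower()
        if kw and key not in seen:
            seen.add(key)
            keywords.append(kw)
    # Cap at the list size the workflow asks the model for.
    return keywords[:limit]


sample = "1. seo tools\n2. keyword research, SEO Tools\n- backlinks"
result = parse_keywords(sample)
```

De-duplication is case-insensitive and keeps the first spelling seen, so "seo tools" and "SEO Tools" count as one keyword.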


5. Use Cases


Additional Workflow Recommendations

✅ Rename Nodes for Clarity

| Current Name | Suggested Name |
| --- | --- |
| Website Name | Website URL Input Form |
| HTTP Request | Fetch Website Content |
| Code | HTML to Plain Text Cleaner |
| Split Out1 | Clean Text Splitter |
| AI Agent1 | Topic Extractor (GPT-5-mini) |
| Code1 | Text Cleanup Formatter |
| Split Out2 | Final Text Splitter |
| AI Agent | Keyword Extractor (GPT-5-mini) |
| Airtable | Airtable Data Upsert |
| Wait1 | Delay Before Merge |
| Merge | Combine Data for Airtable |
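
Renaming in the n8n editor is the simplest route, but the same mapping could also be applied to an exported workflow file before import. The Python sketch below assumes the export stores node names under `nodes[].name` and wires nodes together by name in a `connections` object; `rename_nodes` is a hypothetical helper, and it does not rewrite expressions inside node parameters that refer to other nodes by name.

```python
import copy

RENAMES = {
    "Website Name": "Website URL Input Form",
    "HTTP Request": "Fetch Website Content",
    "Code": "HTML to Plain Text Cleaner",
    "Split Out1": "Clean Text Splitter",
    "AI Agent1": "Topic Extractor (GPT-5-mini)",
    "Code1": "Text Cleanup Formatter",
    "Split Out2": "Final Text Splitter",
    "AI Agent": "Keyword Extractor (GPT-5-mini)",
    "Airtable": "Airtable Data Upsert",
    "Wait1": "Delay Before Merge",
    "Merge": "Combine Data for Airtable",
}


def rename_nodes(workflow, renames):
    """Return a copy of the workflow dict with nodes and connections renamed."""
    wf = copy.deepcopy(workflow)
    for node in wf.get("nodes", []):
        node["name"] = renames.get(node["name"], node["name"])

    new_connections = {}
    for source, outputs in wf.get("connections", {}).items():
        for branches in outputs.values():
            for branch in branches:
                for target in branch or []:
                    # Each target refers to its destination node by name.
                    target["node"] = renames.get(target["node"], target["node"])
        new_connections[renames.get(source, source)] = outputs
    wf["connections"] = new_connections
    return wf


example = {
    "nodes": [{"name": "Code"}, {"name": "AI Agent1"}],
    "connections": {
        "Code": {"main": [[{"node": "AI Agent1", "type": "main", "index": 0}]]}
    },
}
renamed = rename_nodes(example, RENAMES)
```

Connections have to be updated together with the node names; otherwise the renamed nodes would import disconnected.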

🔗 Nodes Used

Airtable, HTTP Request, AI Agent, OpenAI Chat Model, n8n Form Trigger

📥 Import

Download workflow.json and import into n8n: Workflow menu → Import from File
