π Sitemap.xml metadata export to Google Sheets (Title, Meta Description, and URLs)
β‘ 276 views Β· π Market Research & Insights
π‘ Pro Tip β HTTP Request scraping tends to break when sites update their markup. If youβre scraping a major platform, check if ScraperNode covers it β it has maintained scrapers for LinkedIn, Instagram, TikTok, YouTube, and 20+ other platforms that return structured data.
Description
This workflow automates the extraction of SEO metadata (URL, page title, and meta description) from every page listed in your websiteβs sitemap and exports it to Google Sheets. Ideal for SEO audits, content inventories, and tracking on-page elements.
Prerequisites
Before using this workflow:
- A publicly accessible
sitemap.xmlURL - Google Sheets spreadsheet with columns: URL, Title, and meta description
- Google Sheets API access via OAuth2
Setup Instructions
1. Configure Sitemap Source
- In the βGet Sitemap XMLβ node, replace the default URL with your actual sitemap URL
2. Connect Google Sheets
- Open the βAppend or update row in sheetβ node
- Configure Google Sheets credentials
- Set Document ID and Sheet Name
- Verify column mappings match your spreadsheet
3. Adjust Rate Limiting (Optional)
- Modify Wait nodes if encountering 429 errors
- Increase delay between requests if needed
How It Works
- Trigger: Manual workflow execution
- Sitemap Fetch: Retrieve sitemap.xml file
- URL Parsing: Extract all URLs from sitemap
- Batch Processing: Process URLs in manageable batches
- Data Extraction: Scrape title and meta description from each page
- Data Merge: Combine URL with extracted metadata
- Sheet Update: Append or update rows in Google Sheets using URL as a unique key
Features
- Duplicate Prevention: Uses
appendOrUpdatewith URL matching - Rate Limiting: Built-in delays between requests
- Flexible Processing: Handles sitemaps of various sizes
- Easy Customization: Modify code nodes for additional data extraction
Use Cases
- SEO audits of title and description tags
- Content migration planning
- Website content inventory management
- Ongoing SEO monitoring and reporting
π Nodes Used
Google Sheets, HTTP Request
π₯ Import
Download workflow.json and import into n8n:
Workflow menu β Import from File