Extract data from PDF reports with Gmail, OCR, Google Sheets and OpenAI GPT-4.1-mini
Description
Who's it for
Consultants, agencies, financial analysts, and project managers who regularly receive client PDFs, invoices, or reports.
How it works / What it does
This n8n workflow automates PDF and report data extraction:
- Triggers when a new PDF is uploaded to Google Drive or Dropbox, or arrives by email.
- Uses OCR (PDF.co or Cloud OCR) to extract text from PDFs.
- Parses key data fields using OpenAI or Regex:
  - Client Name
  - Project/Report Name
  - Dates
  - Financials / Metrics
- Normalizes extracted data with a Set node for consistency.
- Classifies document type (invoice, report, contract).
- Routes data based on type:
  - Invoice → Update Google Sheets / QuickBooks / Xero
  - Report → Summarize metrics using AI
  - Contract → Notify relevant team members
- Logs extracted data in Google Sheets.
- Optional: Sends notifications via Slack or email.
- Includes error handling for unreadable or incomplete PDFs.
- Includes logging for audit and tracking purposes.
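The regex parsing, classification, and routing steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the workflow's actual node logic: the label patterns (`Client:`, `Total:`), the keyword lists, the ISO date format, and the route names are all hypothetical and would need to match your own documents.

```python
import re

# Hypothetical keyword lists per document type; tune them to your documents.
TYPE_KEYWORDS = {
    "invoice": ("invoice", "amount due", "bill to"),
    "contract": ("agreement", "hereby", "terms and conditions"),
    "report": ("report", "summary", "findings"),
}

# Hypothetical route names standing in for the downstream branches.
ROUTES = {
    "invoice": "update_accounting",
    "report": "summarize_metrics",
    "contract": "notify_team",
}


def classify(text: str) -> str:
    """Return the type whose keywords occur most often, or 'unknown'."""
    lowered = text.lower()
    scores = {
        doc_type: sum(lowered.count(word) for word in words)
        for doc_type, words in TYPE_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"


def extract_fields(text: str) -> dict:
    """Pull a few fields out of OCR text; missing fields become None."""
    client = re.search(r"Client:\s*(.+)", text)
    date = re.search(r"\b(\d{4}-\d{2}-\d{2})\b", text)  # assumes ISO dates
    total = re.search(r"Total:\s*\$?([\d,]+\.\d{2})", text)
    return {
        "client_name": client.group(1).strip() if client else None,
        "date": date.group(1) if date else None,
        # Normalize "1,250.00" to a float so downstream steps see one format.
        "total": float(total.group(1).replace(",", "")) if total else None,
    }


def route(doc_type: str) -> str:
    """Map a document type to a branch; anything unrecognized goes to review."""
    return ROUTES.get(doc_type, "manual_review")


SAMPLE = (
    "INVOICE\n"
    "Client: Acme Corp\n"
    "Date: 2024-03-15\n"
    "Total: $1,250.00\n"
    "Amount due on receipt"
)

record = extract_fields(SAMPLE)
record["document_type"] = classify(SAMPLE)
record["route"] = route(record["document_type"])
print(record)
```

Sending unrecognized types to a manual-review branch, rather than guessing, keeps misclassified documents out of the accounting sheet.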
How to set up
- Connect Google Drive, Dropbox, or Gmail for PDF input.
- Configure OCR node for accurate text extraction.
- Connect OpenAI API for parsing and summarizing key metrics.
- Set up Google Sheets or accounting software to store extracted data.
- Optional: Configure Slack or email notifications for summaries.
- Test the workflow with sample PDFs to confirm that data extraction and routing work correctly.
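When testing with sample PDFs, it can help to check each extracted record for missing fields before it reaches Google Sheets. The sketch below shows one way to do that; the field names in `REQUIRED_FIELDS` are assumptions for illustration and are not taken from the workflow itself.

```python
# Hypothetical list of fields every record is expected to carry.
REQUIRED_FIELDS = ("client_name", "document_name", "date")


def missing_fields(record: dict) -> list:
    """Return the required field names that are absent, None, or blank."""
    missing = []
    for name in REQUIRED_FIELDS:
        value = record.get(name)
        if value is None or str(value).strip() == "":
            missing.append(name)
    return missing


def check_record(record: dict) -> dict:
    """Summarize whether a record is complete enough to log."""
    missing = missing_fields(record)
    return {"ok": not missing, "missing": missing}


complete = {"client_name": "Acme Corp", "document_name": "Q1 Report", "date": "2024-03-15"}
incomplete = {"client_name": "  ", "date": None}

print(check_record(complete))
print(check_record(incomplete))
```

A record that fails this check is a candidate for the error-handling branch rather than the normal routing path.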
Requirements
- n8n account with integrations for Google Drive, Dropbox, Gmail, and Google Sheets.
- OCR service (PDF.co or Cloud OCR).
- OpenAI API key for data parsing.
How to customize
- Adjust AI parsing rules for your specific fields.
- Customize routing to different software (QuickBooks, Xero, Sheets).
- Add further notifications or automation steps as needed.
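One way to adjust the parsing rules is to change the schema that the Structured Output Parser enforces on the model's answer. The sketch below builds a JSON Schema in Python and prints it for pasting into the parser, assuming your n8n version accepts a JSON Schema definition there. Every field name, including the extra `po_number` field, is hypothetical.

```python
import json


def build_schema(extra_fields=None) -> dict:
    """Build a JSON Schema for extracted fields, plus optional custom ones."""
    properties = {
        "client_name": {"type": "string"},
        "document_name": {"type": "string"},
        "document_type": {
            "type": "string",
            "enum": ["invoice", "report", "contract"],
        },
        "dates": {"type": "array", "items": {"type": "string"}},
        "metrics": {"type": "object"},
    }
    # Custom fields are merged last so they can override the defaults.
    properties.update(extra_fields or {})
    return {
        "type": "object",
        "properties": properties,
        "required": ["client_name", "document_type"],
    }


# Example: add a purchase order number for invoice-heavy workloads.
schema = build_schema({"po_number": {"type": "string"}})
print(json.dumps(schema, indent=2))
```

Restricting `document_type` to an enum keeps the model's classification aligned with the three routing branches.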
Created by Hyrum Hurst / QuarterSmart
Keywords: AI PDF extraction, report parsing automation, n8n workflow, invoice extraction, consultant workflow, QuarterSmart
Nodes Used
Google Sheets, Slack, Gmail Trigger, AI Agent, OpenAI Chat Model, Structured Output Parser
Import
Download workflow.json and import into n8n:
Workflow menu → Import from File