🧾 Extract structured invoice data from JotForm PDFs with GPT-4.1-mini & Sheets

🧾 Invoice Processing

Description

Who this is for

This workflow is designed for finance teams, accounting professionals, and automation engineers.

Use Case:

Automates processing of invoice submissions received via JotForm.

What problem this workflow solves

Manually extracting structured data from invoice PDFs submitted through JotForm is time-consuming, error-prone, and repetitive.

This workflow solves that by automating the extraction, from the form submission through to the spreadsheet entry.

What this workflow does

  1. Webhook Trigger (JotForm → n8n): A JotForm submission sends the invoice data and attachment link to n8n.

  2. Parse Submission & Extract Metadata: Extracts submission metadata (form ID, user details, invoice number, file link, etc.) using the Information Extractor node.

  3. Download PDF Attachment: Fetches the uploaded PDF from JotForm’s secure file URL via the HTTP Request node, authenticated using a JotForm API key.

  4. Store & Process File: Saves the invoice to disk and prepares it for AI processing.

  5. Extract Invoice Text Content: Uses the Extract from File node to parse text from the PDF document.

  6. AI-Powered Structured Extraction (OpenAI GPT-4.1-mini): Sends the extracted text to a LangChain LLM Chain with a Structured Output Parser, ensuring consistent JSON output aligned with a defined schema.

  7. Save Extracted Data

    • Writes structured JSON to disk
    • Appends parsed results to Google Sheets for easy reporting
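Steps 6 and 7 depend on the model's reply matching a fixed schema before anything is written to disk or to the sheet. The sketch below illustrates that idea outside n8n, using only the Python standard library. It is not the workflow's actual parser: the field names in INVOICE_SCHEMA and the helper names parse_invoice and to_sheet_row are hypothetical and should be replaced with whatever schema your Structured Output Parser defines.

```python
import json

# Hypothetical schema: field name -> expected Python type. These names are
# illustrative assumptions, not the workflow's actual schema.
INVOICE_SCHEMA = {
    "invoice_number": str,
    "vendor_name": str,
    "invoice_date": str,   # kept as text, e.g. "2025-01-31"
    "currency": str,
    "total_amount": float,
}


def parse_invoice(llm_output: str) -> dict:
    """Parse the model's JSON reply and check it against INVOICE_SCHEMA."""
    data = json.loads(llm_output)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    invoice = {}
    for field, expected in INVOICE_SCHEMA.items():
        if field not in data:
            raise ValueError(f"missing field: {field}")
        value = data[field]
        # JSON has one number type, so accept an int where a float is expected.
        if expected is float and isinstance(value, int) and not isinstance(value, bool):
            value = float(value)
        if not isinstance(value, expected):
            raise ValueError(f"{field} should be {expected.__name__}")
        invoice[field] = value
    # Fields outside the schema are dropped, so rows keep a stable shape.
    return invoice


def to_sheet_row(invoice: dict) -> list:
    """Order values by schema so every appended row has the same columns."""
    return [invoice[field] for field in INVOICE_SCHEMA]


if __name__ == "__main__":
    reply = (
        '{"invoice_number": "INV-001", "vendor_name": "Acme", '
        '"invoice_date": "2025-01-31", "currency": "USD", "total_amount": 120}'
    )
    print(to_sheet_row(parse_invoice(reply)))
```

Rejecting a reply with a missing or mistyped field before the append step keeps malformed rows out of the sheet; in n8n the same check could sit in a Function node between the LLM chain and the Google Sheets node.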

Setup Instructions

Prerequisites

You can build the invoice form in JotForm starting from one of the JotForm Templates.

Invoice Upload Form

Steps

  1. Import the provided JSON into n8n

    • Go to n8n → Workflows → Import from File/Clipboard
    • Paste the provided JSON definition
  2. Configure Webhook

    • Copy the webhook URL from the Webhook node
    • Paste it into your JotForm’s Settings → Integrations → Webhook URL
  3. Set API Keys & Credentials

    • Ensure a JotForm API key is set up so the workflow can download the uploaded PDF document
    • Ensure your Google Sheets and OpenAI credentials are connected
  4. Test Submission

    • Submit your JotForm with an invoice PDF
    • The n8n workflow triggers automatically
  5. Check Outputs

    • Open your Google Sheet to see structured invoice entries
    • Check the disk folder (e.g., C:\Invoices) for JSON exports
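When a test submission triggers the workflow but the download step fails, it can help to inspect the webhook payload and the authenticated file URL outside n8n. The sketch below is a rough illustration under several assumptions that you should verify against your own payload: that the webhook body carries the form answers as a JSON string in a rawRequest field, that upload answers arrive as lists of URL strings, and that the file URL accepts the API key as an apiKey query parameter. The helper names and the example.com URL are placeholders.

```python
import json
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def find_pdf_urls(webhook_fields: dict) -> list:
    """Collect PDF upload URLs from the JSON string in the rawRequest field."""
    # Assumption: rawRequest holds the form answers as a JSON object.
    answers = json.loads(webhook_fields["rawRequest"])
    urls = []
    for value in answers.values():
        # Assumption: upload answers are lists of URL strings.
        candidates = value if isinstance(value, list) else [value]
        for item in candidates:
            if (
                isinstance(item, str)
                and item.startswith("https://")
                and urlsplit(item).path.lower().endswith(".pdf")
            ):
                urls.append(item)
    return urls


def with_api_key(url: str, api_key: str) -> str:
    """Append apiKey as a query parameter, keeping any existing parameters."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("apiKey", api_key))
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    # Placeholder payload; a real one comes from the Webhook node's output.
    fields = {
        "formID": "123",
        "rawRequest": json.dumps(
            {"q3_name": "Jane", "q5_invoice": ["https://example.com/uploads/invoice.pdf"]}
        ),
    }
    for pdf_url in find_pdf_urls(fields):
        print(with_api_key(pdf_url, "YOUR_API_KEY"))
```

Opening the printed URL in a browser or with an HTTP client shows whether the key has permission to read the upload, which separates credential problems from workflow wiring problems.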

How to customize this workflow

Summary

This workflow converts JotForm-uploaded invoice PDFs into structured financial data automatically, using OpenAI GPT-4.1-mini for the extraction and Google Sheets for the results.

Key Features:

🔗 Nodes Used

Function, Google Sheets, HTTP Request, Webhook, Basic LLM Chain, OpenAI Chat Model

📥 Import

Download workflow.json and import into n8n: Workflow menu → Import from File
