🧾 Automated invoice data extraction with LlamaParse, Gemini 2.5 & Google Sheets

Description

This n8n template demonstrates how to automate invoice data extraction from PDF attachments received via Gmail. Using LlamaParse and a Gemini LLM, the workflow extracts structured fields such as PO numbers, line items, tax amounts, and totals, then stores them in a Google Sheet.
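As a rough illustration of the final "store in a Google Sheet" step, the sketch below flattens one extracted invoice into one row per line item. The field names (`po_number`, `line_items`, `tax_amount`, `total`, and the per-item keys) and the column order are assumptions made for this example, not names taken from the template itself.

```python
from typing import Any

# Hypothetical column order; adjust it to match the header row of your sheet.
COLUMNS = ["po_number", "description", "quantity", "unit_price", "tax_amount", "total"]


def invoice_to_rows(invoice: dict[str, Any]) -> list[list[Any]]:
    """Return one sheet row per line item, repeating the invoice-level fields."""
    header_fields = {
        key: invoice.get(key, "") for key in ("po_number", "tax_amount", "total")
    }
    rows = []
    for item in invoice.get("line_items", []):
        # Merge invoice-level and item-level values, then order them by COLUMNS.
        merged = {**header_fields, **item}
        rows.append([merged.get(col, "") for col in COLUMNS])
    return rows


if __name__ == "__main__":
    example = {
        "po_number": "PO-1001",
        "tax_amount": 12.5,
        "total": 137.5,
        "line_items": [
            {"description": "Widget", "quantity": 5, "unit_price": 20.0},
            {"description": "Shipping", "quantity": 1, "unit_price": 25.0},
        ],
    }
    for row in invoice_to_rows(example):
        print(row)
```

Repeating the invoice-level values on every row keeps the sheet easy to filter and pivot, at the cost of some duplication. An alternative is to write line items to a separate tab keyed by PO number.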

Perfect for use cases such as:

💼 Finance teams managing vendor invoices

📊 Bookkeeping workflows

🔄 Automating monthly reconciliation

Good to Know

At the time of writing, LlamaParse and Gemini may incur API usage costs depending on your subscription tier. Check the LlamaIndex and Gemini pricing pages for current rates.

LlamaParse returns the parsed document as Markdown, which is then passed to the LLM for structured field extraction.
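The Markdown-to-structured-fields hand-off can be sketched outside n8n as follows. This is a minimal illustration, not the template's actual prompt or parser: the field names, the prompt wording, and the `Invoice` class are all assumptions, and no real LlamaParse or Gemini call is made. It only shows how a prompt might be built from the Markdown and how a JSON reply could be validated before it reaches the sheet.

```python
import json
from dataclasses import dataclass, field

# Hypothetical field names, chosen to mirror the fields listed in the description.
REQUIRED = ("po_number", "line_items", "tax_amount", "total")
FENCE = "`" * 3  # Markdown code-fence marker that models sometimes wrap JSON in


@dataclass
class Invoice:
    po_number: str
    tax_amount: float
    total: float
    line_items: list = field(default_factory=list)


def build_prompt(markdown: str) -> str:
    """Combine extraction instructions with the Markdown produced by the parser."""
    return (
        "Extract these fields from the invoice below and reply with JSON only: "
        + ", ".join(REQUIRED)
        + ".\n\n"
        + markdown
    )


def parse_llm_reply(reply: str) -> Invoice:
    """Validate the model's reply and convert it into an Invoice."""
    text = reply.strip()
    if text.startswith(FENCE):
        # Drop the opening fence line and everything from the closing fence on.
        _, _, text = text.partition("\n")
        text = text.rsplit(FENCE, 1)[0]
    data = json.loads(text)
    missing = [key for key in REQUIRED if key not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return Invoice(
        po_number=str(data["po_number"]),
        tax_amount=float(data["tax_amount"]),
        total=float(data["total"]),
        line_items=list(data["line_items"]),
    )
```

In the workflow itself, the Structured Output Parser node plays the role of `parse_llm_reply`. The point of the sketch is that a reply missing a required field should fail loudly rather than write a partial row.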

Gemini models may be geo-restricted. If you encounter “model not found” errors, your region might not be supported.

How it Works

How to Use

The trigger is based on Gmail, but you can replace this with a webhook or manual trigger for testing.
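Whichever trigger you use, the downstream steps need a PDF to work on. To show what the Gmail side supplies, the sketch below picks PDF attachments out of a message in the JSON shape returned by the Gmail API's `users.messages.get` (a `payload` with nested `parts`). It is an assumption-laden illustration for testing outside n8n, since the Gmail Trigger node may expose attachments differently. The function names are hypothetical.

```python
from typing import Any, Iterator


def iter_parts(part: dict[str, Any]) -> Iterator[dict[str, Any]]:
    """Yield a MIME part and all of its nested sub-parts, depth first."""
    yield part
    for child in part.get("parts", []):
        yield from iter_parts(child)


def pdf_attachments(message: dict[str, Any]) -> list[dict[str, str]]:
    """Return filename and attachment ID for every PDF attached to the message."""
    found = []
    for part in iter_parts(message.get("payload", {})):
        filename = part.get("filename", "")
        attachment_id = part.get("body", {}).get("attachmentId")
        # Some senders label PDFs as generic binary data, so check the name too.
        is_pdf = (
            part.get("mimeType") == "application/pdf"
            or filename.lower().endswith(".pdf")
        )
        if attachment_id and is_pdf:
            found.append({"filename": filename, "attachment_id": attachment_id})
    return found
```

A hand-built message dictionary in this shape can stand in for a real email when you test with a manual trigger.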

Setup Instructions

Gmail API

Google Sheets

LlamaParse

Gemini (via Vertex AI)

Labeling

Requirements

Gmail account with API access

LlamaParse (LlamaIndex) account with API Key

Google Sheets API credentials

Access to a Gemini 2.5 model via Google Vertex AI

Customising This Workflow

This template is just the beginning. You can expand it to suit your own invoice workflow.

🔗 Nodes Used

Google Sheets, HTTP Request, Gmail, Gmail Trigger, Basic LLM Chain, Structured Output Parser

📥 Import

Download workflow.json and import it into n8n: Workflow menu → Import from File
