🧾 Extract and organize Colombian invoices with Gmail, GPT-4o & Google Workspace
Description
🧾 Personal Invoice Processor
This n8n workflow automates the extraction and organization of personal Colombian invoices received via Gmail. It includes the following key steps:
🔁 Flow Summary
- Email Trigger
  - Polls Gmail every 30 minutes for emails with `.zip` attachments (assumed to contain invoices).
  - Expects a ZIP file following DIAN standards.
- ZIP File Handling
  - Extracts all files.
  - Keeps only PDF and XML files for processing.
- Data Extraction & Processing
  - Uses LangChain Agent + OpenAI (GPT-4o-mini) to extract:
    - Tipo de documento (document type: Factura / Nota Crédito, i.e. invoice or credit note)
    - Número de factura (invoice number)
    - Fecha de emisión (issue date, YYYY-MM-DD)
    - NIT emisor y receptor (issuer and recipient tax IDs, without the check digit)
    - Razón social del emisor (issuer's legal name)
    - Subtotal, IVA (VAT), Total
    - CUFE (the unique electronic invoice code)
    - Resumen de compra (purchase summary: a single formatted sentence of at most 20 words)
- Validation
  - Ensures Total = Subtotal + IVA using a calculator node.
- Storage
  - Uploads the original PDF to Google Drive.
  - Renames the file to: `YYYY-MM-DD-NUMERO_FACTURA.pdf`.
  - Inserts or updates invoice details in Google Sheets using a unique `Key` (NIT_Emisor + Numero_Factura) to prevent duplication.
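
For readers who want to see the ZIP File Handling step outside n8n, here is a minimal Python sketch of the same idea: open the attachment, skip folders, and keep only PDF and XML entries. The workflow itself does this with n8n nodes; the function name `extract_invoice_files` and the case-insensitive extension match are assumptions made for illustration.

```python
import io
import zipfile

# Extensions the workflow keeps for processing (per the step described above).
INVOICE_EXTENSIONS = (".pdf", ".xml")


def extract_invoice_files(zip_bytes: bytes) -> dict[str, bytes]:
    """Return {filename: content} for the PDF and XML entries of a ZIP archive.

    Hypothetical helper: it mirrors the "extract all, then filter" step,
    but is not taken from the workflow.
    """
    selected: dict[str, bytes] = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            # Directory entries carry no invoice data.
            if info.is_dir():
                continue
            # Assumption: extensions are matched case-insensitively.
            if info.filename.lower().endswith(INVOICE_EXTENSIONS):
                selected[info.filename] = archive.read(info)
    return selected
```

Reading everything into memory is reasonable here because invoice archives are small; a large archive would be better streamed entry by entry.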
> ⚙️ Designed for personal use with minimal latency tolerance and high automation reliability.
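
The Validation and Storage rules in the flow summary can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the workflow's own logic: the function names are hypothetical, the rounding tolerance is an added assumption (the workflow checks plain equality with a calculator node), and the key is assumed to be a plain concatenation with no separator.

```python
from decimal import Decimal


def totals_match(subtotal: str, iva: str, total: str, tolerance: str = "0.01") -> bool:
    """Check Total = Subtotal + IVA.

    Decimal avoids float rounding on currency values. The tolerance is an
    assumption for illustration; set it to "0" for strict equality.
    """
    difference = Decimal(total) - (Decimal(subtotal) + Decimal(iva))
    return abs(difference) <= Decimal(tolerance)


def drive_filename(issue_date: str, invoice_number: str) -> str:
    """Build the YYYY-MM-DD-NUMERO_FACTURA.pdf name used for the Drive upload."""
    return f"{issue_date}-{invoice_number}.pdf"


def sheet_key(nit_emisor: str, invoice_number: str) -> str:
    """Build the deduplication key, NIT_Emisor + Numero_Factura.

    Assumption: the two values are concatenated without a separator.
    """
    return f"{nit_emisor}{invoice_number}"


def upsert(rows: dict[str, dict], invoice: dict) -> None:
    """Insert or update a row by key, so reprocessing an invoice never duplicates it."""
    key = sheet_key(invoice["nit_emisor"], invoice["numero_factura"])
    rows[key] = invoice
```

With made-up values, an invoice dated `2024-03-15` with number `FE1234` from issuer NIT `900123456` would be stored as `2024-03-15-FE1234.pdf` under the key `900123456FE1234`; processing the same email twice overwrites that row instead of adding a second one.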
🔗 Nodes Used
Google Sheets, Google Drive, Gmail Trigger, Filter, AI Agent, OpenAI Chat Model
📥 Import
Download workflow.json and import into n8n:
Workflow menu → Import from File