✨ Multi-modal expense tracking with GPT-4, Gemini OCR, and voice via Telegram

Description

This n8n template creates an intelligent expense tracking system 🤖 that processes text, voice, and receipt images through Telegram. The assistant automatically categorizes expenses, handles currency conversions 🌍, and maintains financial records in Google Sheets while providing smart spending insights 💡.

Use Cases:

How it works:

  1. Multi-Input Processing: A Telegram Trigger node captures text messages, voice notes, and receipt images.
  2. Content Analysis: A Switch node routes different input types (text, audio, images) to appropriate processors.
  3. Voice Processing: ElevenLabs converts voice messages to text for expense extraction.
  4. Receipt OCR: Google Gemini analyzes receipt images to extract amounts and descriptions.
  5. Expense Classification: An LLM determines if the input is an expense or a general query.
  6. Expense Parsing: When a message contains several expenses, the AI splits it into individual items and normalizes each one.
  7. Currency Conversion: An exchange rate API converts foreign currencies to USD.
  8. Smart Categorization: The AI agent assigns expenses to predefined categories with emojis.
  9. Data Storage: Google Sheets stores all expense records with automatic totals.
  10. Intelligent Responses: The agent provides spending summaries, alerts, and financial insights.
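
To make steps 2 and 7 more concrete, here is a minimal Python sketch of the routing and currency conversion logic. It is not taken from the workflow itself: the function names, the `Expense` class, and the `SAMPLE_USD_RATES` table are hypothetical, and the rates are made-up placeholders standing in for whatever the exchange rate API returns. The only outside fact it relies on is that Telegram messages carry `text`, `voice`, or `photo` fields depending on their type.

```python
from dataclasses import dataclass


def detect_input_type(message: dict) -> str:
    """Pick a route from the fields present on a Telegram message,
    roughly as the Switch node in step 2 might do."""
    if "voice" in message:
        return "audio"
    if "photo" in message:
        return "image"
    if "text" in message:
        return "text"
    return "unsupported"


# Hypothetical placeholder rates: USD per 1 unit of the listed currency.
# A real workflow would fetch current values from an exchange rate API.
SAMPLE_USD_RATES = {"USD": 1.0, "EUR": 1.10, "GBP": 1.25}


@dataclass
class Expense:
    description: str
    amount: float
    currency: str  # ISO 4217 code such as "EUR"


def to_usd(expense: Expense, rates: dict) -> float:
    """Convert an expense to USD, rounded to cents (step 7)."""
    code = expense.currency.upper()
    if code not in rates:
        raise ValueError(f"No exchange rate available for {code}")
    return round(expense.amount * rates[code], 2)


if __name__ == "__main__":
    print(detect_input_type({"voice": {"file_id": "abc"}}))  # audio
    print(to_usd(Expense("lunch", 8.0, "gbp"), SAMPLE_USD_RATES))  # 10.0
```

Raising an error for an unknown currency, rather than silently passing the amount through, is one possible design choice; it keeps unconverted amounts from being stored in the sheet as if they were USD.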

Requirements:

Good to know:

Customizing this workflow:

🔗 Nodes Used

HTTP Request, Telegram, Telegram Trigger, AI Agent, Basic LLM Chain, Calculator

📥 Import

Download workflow.json and import into n8n: Workflow menu → Import from File
