πŸ’¬ Analyze images & extract text with GPT-4o Vision and Telegram

⚑ 1,106 views Β· πŸ’¬ Support Chatbots

Description

image.png

Who’s it for

Teams and makers who want a plug-and-play vision bot: users send a photo in Telegram, the bot returns a concise description plus OCR text. No custom servers requiredβ€”just n8n, a Telegram bot, and an AIMLAPI key.

What it does / How it works

The workflow listens for new Telegram messages, fetches the highest-resolution photo, converts it to base64, normalizes the MIME type, and calls AIMLAPI (GPT-4o Vision) via the HTTP Request node using the OpenAI-compatible messages format with an image_url data URI. The model returns a short caption and extracted text. The answer is sent back to the same Telegram chat.

Requirements

How to set up

  1. Create a Telegram bot with @BotFather and copy the token.
  2. In n8n, add Telegram credentials (no hardcoded tokens in nodes).
  3. Add AIMLAPI credentials with your API key (base URL: https://api.aimlapi.com/v1).
  4. Import the workflow JSON and connect credentials in the nodes.
  5. Execute the trigger and send a photo to your bot to test.

How to customize the workflow

πŸ”— Nodes Used

HTTP Request, Telegram, Telegram Trigger, Extract from File

πŸ“₯ Import

Download workflow.json and import into n8n: Workflow menu β†’ Import from File

πŸ“– Importing guide Β· πŸ”‘ Credential setup