🔬 Extract Arabic text from PDFs with Mistral OCR, Telegram Bot & Google Docs

⚡ 507 views · 🔬 Document Extraction & Analysis

Description

Arabic OCR Telegram Bot

How it Works

  1. Receive PDF Files - Users send PDF documents via Telegram to the bot
  2. OCR Processing - Mistral AI's OCR service extracts Arabic text from each page of the document
  3. Text Organization - The extracted content is formatted and labeled with page numbers
  4. Create Google Doc - Generates a formatted document with all extracted text
  5. Deliver Results - Sends users a clickable link to their processed document
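
The Text Organization step can be sketched roughly as follows. This is a minimal illustration, not the workflow's actual node code. It assumes the OCR response contains a `pages` list whose items carry a zero-based `index` and a `markdown` string. The function name `format_ocr_pages` and the `--- Page N ---` header style are hypothetical.

```python
from typing import Any


def format_ocr_pages(ocr_response: dict[str, Any]) -> str:
    """Join per-page OCR text into one string with page-number headers.

    Assumes (not confirmed by the workflow) a response shaped like
    {"pages": [{"index": 0, "markdown": "..."}, ...]}.
    """
    # Sort defensively so the output follows the document's page order.
    pages = sorted(ocr_response.get("pages", []), key=lambda p: p.get("index", 0))

    sections = []
    for page in pages:
        text = (page.get("markdown") or "").strip()
        if not text:
            continue  # skip pages where no text was recognised
        page_number = page.get("index", 0) + 1  # convert to 1-based numbering
        sections.append(f"--- Page {page_number} ---\n{text}")

    # A blank line between pages keeps the resulting Google Doc readable.
    return "\n\n".join(sections)
```

The returned string is what would be written into the Google Doc body. Arabic text passes through unchanged because Python strings are Unicode.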

Set up Steps

Setup Time: ~20 minutes

  1. Create Telegram Bot - Get a bot token from @BotFather on Telegram
  2. Configure APIs - Set up Mistral AI OCR and Google Docs API credentials
  3. Set Folder Permissions - Create a Google Drive folder for storing results and make sure the connected Google account can write to it
  4. Test Bot - Send a sample Arabic PDF to verify OCR accuracy
  5. Deploy Webhook - Activate the Telegram webhook for real-time processing
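
For the Configure APIs step, the two HTTP calls involved can be sketched as plain request descriptions. This is a hedged illustration under stated assumptions. It uses Mistral's `/v1/ocr` endpoint with the `mistral-ocr-latest` model and a `document_url` input, plus Telegram's file download URL pattern. The helper names `telegram_file_url` and `build_mistral_ocr_request` are hypothetical, and the actual workflow may pass the PDF to Mistral differently, for example by uploading the file first.

```python
from typing import Any

MISTRAL_OCR_ENDPOINT = "https://api.mistral.ai/v1/ocr"


def telegram_file_url(bot_token: str, file_path: str) -> str:
    """Build the download URL for a file path returned by Telegram's getFile."""
    # Note: this URL embeds the bot token, so treat it as a secret.
    return f"https://api.telegram.org/file/bot{bot_token}/{file_path}"


def build_mistral_ocr_request(api_key: str, document_url: str) -> dict[str, Any]:
    """Describe the OCR request an HTTP Request node would send.

    Assumption: the PDF is reachable at `document_url`. Nothing is sent
    here; the function only assembles the URL, headers, and JSON body.
    """
    return {
        "method": "POST",
        "url": MISTRAL_OCR_ENDPOINT,
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        "json": {
            "model": "mistral-ocr-latest",
            "document": {"type": "document_url", "document_url": document_url},
        },
    }
```

In n8n these values map onto the HTTP Request node's URL, header, and body fields, with the API key stored as a credential rather than typed into the node.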

Detailed API configuration and Arabic text handling notes are included as sticky notes within the workflow.


What You'll Need:

Key Features:

🔗 Nodes Used

HTTP Request, Telegram, Telegram Trigger, Google Docs

📥 Import

Download workflow.json and import into n8n: Workflow menu → Import from File
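
Before importing, it can help to check that the downloaded file contains the expected nodes. The sketch below assumes the export has a top-level `nodes` list whose items carry a `type` field, as n8n workflow exports generally do. The function name `count_node_types` is hypothetical.

```python
import json
from collections import Counter


def count_node_types(workflow_json: str) -> Counter:
    """Count how many nodes of each type a workflow export contains."""
    workflow = json.loads(workflow_json)
    # Nodes without a "type" field are grouped under "unknown".
    return Counter(node.get("type", "unknown") for node in workflow.get("nodes", []))


if __name__ == "__main__":
    # Assumes workflow.json sits in the current directory.
    with open("workflow.json", encoding="utf-8") as handle:
        for node_type, count in count_node_types(handle.read()).most_common():
            print(f"{count:3d}  {node_type}")
```

Run against this workflow, the output should list node types for HTTP Request, Telegram, Telegram Trigger, and Google Docs, along with any sticky notes.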

📖 Importing guide · 🔑 Credential setup