🔬 Generate document summaries & Q&As from PDF/TXT using GPT-4o with Slack alerts

⚡ 193 views · 🔬 Document Extraction & Analysis

📘 Description

This workflow automates document understanding. It accepts an uploaded PDF or TXT file, extracts its text, generates a structured summary and a question–answer set with GPT-4o, validates the AI output, and returns a clean JSON response to the requester. It also sends a short preview to Slack for internal visibility. Along the way it detects the file type, handles binary text extraction, and enforces strict JSON formatting from the AI model, so the final response is structured and ready for use in downstream systems. Errors (missing text, invalid JSON, or malformed AI output) are logged automatically to Google Sheets for debugging. The workflow is designed as a plug-and-play document-analysis engine that turns an uploaded PDF or TXT document into a summary and Q&A set without manual steps.

⚙️ What This Workflow Does (Step-by-Step)

📥 Receive Document Upload via Webhook
Captures incoming files (PDF or TXT) posted to the webhook endpoint.

🔍 Check If Uploaded File Is PDF / TXT
Detects the file extension and routes the file to the matching extractor:
PDF → PDF extractor
TXT → text extractor
Other file types are ignored.
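
The routing rule above can be sketched in a few lines of Python. This is only an illustration of the extension check, not the workflow's actual node configuration; the route names `pdf_extractor` and `text_extractor` and the function `route_for` are hypothetical.

```python
from pathlib import PurePath
from typing import Optional

# Hypothetical route names; in the workflow these are branches to extractor nodes.
ROUTES = {".pdf": "pdf_extractor", ".txt": "text_extractor"}


def route_for(filename: str) -> Optional[str]:
    """Return the extraction route for a filename, or None if the type is unsupported."""
    suffix = PurePath(filename).suffix.lower()  # case-insensitive extension match
    return ROUTES.get(suffix)
```

Calling `route_for("Report.PDF")` selects the PDF branch, while a file such as `image.png` yields `None`, which corresponds to the "ignored" case.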

📝 Extract Text from Document
Extracts readable text from PDF binaries.
Reads raw plain text from TXT files.
The extracted text becomes the input for the AI analysis.
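
For the TXT branch, a minimal sketch of turning uploaded bytes into text might look like the following. The UTF-8-first, Latin-1-fallback policy is an assumption made for illustration (the source does not say how encodings are handled), and PDF extraction is left out because it needs a dedicated PDF library.

```python
def decode_text(data: bytes) -> str:
    """Decode uploaded TXT bytes, preferring UTF-8 and falling back to Latin-1."""
    try:
        text = data.decode("utf-8-sig")  # also drops a leading byte-order mark
    except UnicodeDecodeError:
        text = data.decode("latin-1")  # never raises, but may mis-map characters
    return text.strip()


def has_usable_text(text: str) -> bool:
    """True when extraction produced something worth sending to the model."""
    return bool(text)
```

An empty result from `decode_text` would correspond to the "missing text" error case mentioned in the description.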

🤖 Generate Summary & Q&A Using AI
Uses GPT-4o to produce:
A 150–200 word summary
Five structured Q&A pairs
The output must strictly follow the specified JSON schema.
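
The source does not publish its JSON schema or prompt, so the sketch below only illustrates one way the requirements above could be stated to the model. The field names `summary`, `qa`, `question`, and `answer`, and the function `build_prompt`, are assumptions.

```python
import json

# Hypothetical output shape; the real workflow's field names may differ.
EXAMPLE_SHAPE = {
    "summary": "<150-200 word summary>",
    "qa": [{"question": "<question>", "answer": "<answer>"}],
}


def build_prompt(document_text: str) -> str:
    """Build an instruction that asks for JSON only, in the shape above."""
    schema = json.dumps(EXAMPLE_SHAPE, indent=2)
    return (
        "Summarize the document in 150-200 words and write exactly five "
        "question-answer pairs.\n"
        "Respond with JSON only, matching this shape (the qa array must "
        "contain five items):\n"
        f"{schema}\n\n"
        f"Document:\n{document_text}"
    )
```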

🧠 LLM Engine + Memory Context
GPT-4o provides the reasoning engine.
A memory buffer maintains short context for stability.
An output parser ensures schema compliance.

⚠️ Validate AI Output Before Processing
Checks whether the output is non-empty and correctly structured. Invalid output is logged to Google Sheets.
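
A structural check of this kind could be sketched as follows. The field names (`summary`, `qa`, `question`, `answer`) are assumed for illustration, since the source does not show its schema, and the function returns a list of problems so that a failure can be logged with a reason.

```python
from typing import Any, List


def validation_errors(output: Any) -> List[str]:
    """Return a list of problems; an empty list means the output is usable."""
    if not isinstance(output, dict):
        return ["output is not a JSON object"]
    errors = []
    summary = output.get("summary")
    if not isinstance(summary, str) or not summary.strip():
        errors.append("summary is missing or empty")
    qa = output.get("qa")
    if not isinstance(qa, list) or len(qa) != 5:
        errors.append("qa must be a list of five items")
    else:
        for i, pair in enumerate(qa):
            ok = (
                isinstance(pair, dict)
                and isinstance(pair.get("question"), str)
                and pair["question"].strip()
                and isinstance(pair.get("answer"), str)
                and pair["answer"].strip()
            )
            if not ok:
                errors.append(f"qa[{i}] needs a non-empty question and answer")
    return errors
```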

📊 Log Invalid AI Output to Google Sheet
Records failures for audit, debugging, and retraining.
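
As a rough sketch, a failure record could be assembled as one row before it is appended to the sheet. The column layout (timestamp, filename, reason, raw output) and the 1000-character cap are assumptions; the source does not describe the sheet's columns.

```python
from datetime import datetime, timezone
from typing import List

MAX_CELL_CHARS = 1000  # assumed cap so the raw-output cell stays readable


def build_log_row(filename: str, reason: str, raw_output: str) -> List[str]:
    """Build one sheet row: UTC timestamp, file name, failure reason, raw output."""
    timestamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    return [timestamp, filename, reason, raw_output[:MAX_CELL_CHARS]]
```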

🧹 Unwrap AI Output Object
Removes unnecessary array wrappers and normalizes the result.
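
The unwrapping step might be approximated like this. It assumes the model result can arrive nested in single-item lists and under a lone `output` key; that wrapper shape is an assumption for illustration, not something the source specifies.

```python
from typing import Any


def unwrap(value: Any) -> Any:
    """Strip single-item list wrappers and a lone 'output' key, repeatedly."""
    while True:
        if isinstance(value, list) and len(value) == 1:
            value = value[0]  # [x] -> x
        elif isinstance(value, dict) and set(value) == {"output"}:
            value = value["output"]  # {"output": x} -> x
        else:
            return value
```

Lists with more than one item and dictionaries with other keys are returned unchanged, so an already clean object passes through untouched.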

📤 Prepare Final Response Payload
Ensures the workflow responds with a single clean JSON object.

🔁 Send Final Summary & Q&A Response to Webhook
Returns the final structured JSON to the requesting system.

💬 Send Summary Preview to Slack
Shares a short preview (the first 300 characters of the summary) for internal visibility.
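
A sketch of building the preview text is shown below. The whitespace collapsing, the trailing `...` marker, and the message prefix are illustrative choices rather than the workflow's documented behavior; `channel` and `text` are the standard fields of a Slack `chat.postMessage` request, and the channel name in the usage note is hypothetical.

```python
PREVIEW_CHARS = 300


def build_preview(summary: str, limit: int = PREVIEW_CHARS) -> str:
    """Return at most `limit` characters of the summary, marking truncation."""
    text = " ".join(summary.split())  # collapse newlines and repeated spaces
    if len(text) <= limit:
        return text
    return text[:limit].rstrip() + "..."


def slack_payload(channel: str, summary: str) -> dict:
    """Build the message body for a Slack post."""
    return {"channel": channel, "text": f"Summary preview: {build_preview(summary)}"}
```

For example, `slack_payload("#doc-alerts", summary)` produces a body that could be sent by the Slack node or any Slack client.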

🧩 Prerequisites

Webhook endpoint configured for uploads
Azure OpenAI GPT-4o credentials
Google Sheets OAuth connection
Slack bot token

💡 Key Benefits

✔ Fully automated PDF/TXT understanding
✔ AI-powered summary + structured Q&A
✔ Strict JSON compliance for downstream systems
✔ Failure logging: invalid outputs are recorded for investigation
✔ Slack visibility for quick internal review
✔ Works with minimal human involvement

👥 Perfect For

🔗 Nodes Used

Google Sheets, Slack, Webhook, AI Agent, Simple Memory, Structured Output Parser

📥 Import

Download workflow.json and import it into n8n: Workflow menu → Import from File
