🔬 Extract personal data with self-hosted LLM Mistral NeMo

⚡ 6,049 views · 🔬 Document Extraction & Analysis

💡 Pro Tip — HTTP Request scraping tends to break when sites update their markup. If you’re scraping a major platform, check if ScraperNode covers it — it has maintained scrapers for LinkedIn, Instagram, TikTok, YouTube, and 20+ other platforms that return structured data.

Description

This workflow shows how to use a self-hosted Large Language Model (LLM) with n8n’s LangChain integration to extract personal information from user input. This is particularly useful for enterprise environments where data privacy is crucial, as it allows sensitive information to be processed locally.

📖 For a detailed explanation and more insights on using open-source LLMs with n8n, take a look at our comprehensive guide on open-source LLMs.

🔑 Key Features

Local LLM
- Connect Ollama to run Mistral NeMo LLM locally
- Provide a foundation for compliant data processing, keeping sensitive information on-premises
Data extraction
- Convert unstructured text to a consistent JSON format
- Adjust the JSON schema to meet your specific data extraction needs.
Error handling
- Implement auto-fixing for LLM outputs
- Include error output for further processing

⚙️ Setup and сonfiguration

Prerequisites

n8n AI Starter Kit installed

Configuration steps

Add the Basic LLM Chain node with system prompts.
Set up the Ollama Chat Model with optimized parameters.
Define the JSON schema in the Structured Output Parser node.

🔍 Further resources

Apply the power of self-hosted LLMs in your n8n workflows while maintaining control over your data processing pipeline!

🔗 Nodes Used

Basic LLM Chain, Ollama Chat Model, Auto-fixing Output Parser, Structured Output Parser, Chat Trigger

📥 Import

Download workflow.json and import into n8n: Workflow menu → Import from File

📖 Importing guide · 🔑 Credential setup