Beginner AI dataset generator using OpenAI + LangChain in n8n
2,003 views · Document Extraction & Analysis
Description
This n8n workflow dynamically generates a realistic sample dataset based on a single topic you provide. It uses OpenAI (via LangChain) and n8n's built-in nodes to:
- Generate structured JSON data for 5 columns with 3–5 values each
- Flatten that data into a single text blob
- Infer meaningful column names via a second AI call
- Pivot, split, merge, and rename columns automatically
- Output a clean, labeled dataset ready for export or further processing
Prerequisites
- OpenAI API Key
  - Visit: https://platform.openai.com/account/api-keys
  - Create a new key
  - In n8n: Credentials → New → OpenAI API, paste key, name it "OpenAi account"
- LangChain nodes enabled in your n8n instance
Step 1: Set Up OpenAI Credential
- Go to OpenAI API Keys (https://platform.openai.com/account/api-keys)
- Create and copy your key
- In n8n: Credentials → New → OpenAI API → paste key and name the credential "OpenAi account"
Step 2: Manual Trigger
- Add Manual Trigger to start the workflow
Step 3: Set Topic
- Add a Set node named "Set Topic to Search"
- Field: Topic = n8n use cases (or any topic you choose)
Step 4: Generate Structured Data
- LangChain Agent node "Generate Random Data"
- Connect to OpenAI Chat Model1 and Tool: Inject Creativity1
- System prompt: instruct AI to output 5 columns of realistic values in JSON
Step 5: Parse AI Output
- Structured Output Parser to validate JSON
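As a rough illustration of the kind of check this step performs, here is a minimal standalone Python sketch. It is not the parser node's actual schema: the key names (column1 to column5), the function name validate_columns, and the 3–5 value bounds taken from the description are assumptions for illustration only.

```python
import json


def validate_columns(raw, n_columns=5, min_values=3, max_values=5):
    """Parse a JSON string and check it holds n_columns lists of 3-5 values each."""
    data = json.loads(raw)  # raises json.JSONDecodeError on malformed JSON
    if not isinstance(data, dict) or len(data) != n_columns:
        raise ValueError(f"expected an object with {n_columns} columns")
    for name, values in data.items():
        if not isinstance(values, list):
            raise ValueError(f"{name}: expected a list of values")
        if not (min_values <= len(values) <= max_values):
            raise ValueError(f"{name}: expected {min_values}-{max_values} values")
    return data


# Hypothetical payload: five columns, three placeholder values each.
sample = json.dumps({f"column{i}": ["a", "b", "c"] for i in range(1, 6)})
parsed = validate_columns(sample)
```

In the workflow itself, the Structured Output Parser node does this job; the sketch only shows why validating early keeps the later split and merge steps predictable.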
Step 6: Flatten Data
- Code node "Outpt all Data to One Field"
- Joins all values into a comma-separated string for column naming
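The flattening idea can be sketched in a few lines of standalone Python. This is not the Code node's actual contents; the column keys, sample values, and the flatten_values helper are hypothetical.

```python
def flatten_values(columns):
    """Join every value from every column into one comma-separated string."""
    return ", ".join(str(value) for values in columns.values() for value in values)


# Hypothetical parsed output, trimmed to two columns for brevity.
columns = {
    "column1": ["Slack alerts", "CRM sync"],
    "column2": ["Marketing", "Sales"],
}
blob = flatten_values(columns)
print(blob)
```

The resulting single string is what a second AI call can read to guess what each column represents.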
Step 7: Generate Column Names
- LangChain Agent node "Generate Column Names"
- Connect to OpenAI Chat Model2
- Prompt: infer 5 column names from the string
Step 8: Pivot Names Row
- Code node "Pivot Column Names" transforms array into { column1: name1, … }
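A minimal standalone Python sketch of this pivot, assuming the AI returns the names as a plain list in column order. The pivot_column_names helper and the sample names are hypothetical.

```python
def pivot_column_names(names):
    """Map a list of names to keys column1, column2, ... in list order."""
    return {f"column{i}": name for i, name in enumerate(names, start=1)}


# Hypothetical names a model might infer for the topic "n8n use cases".
header = pivot_column_names(["Use Case", "Department", "Trigger", "Tool", "Benefit"])
print(header)
```

Keying the names by position is what lets a later step line them up with the matching data columns.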
Step 9: Split Columns
- 5 SplitOut nodes to break each array back into rows per column
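Conceptually, each split turns one array field into one item per value. A standalone Python sketch of that behavior follows; split_out is a hypothetical helper, not the node's implementation, and the data is made up.

```python
def split_out(columns, key):
    """Turn one column's array into one single-field item (row) per value."""
    return [{key: value} for value in columns[key]]


# Hypothetical parsed data with two columns.
columns = {"column1": ["a", "b", "c"], "column2": ["x", "y", "z"]}
rows = split_out(columns, "column1")
print(rows)
```

Running one such split per column yields five parallel streams of items, one per column.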
Step 10: Merge Rows
- Merge node "Merge Columns together" using combineByPosition
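Combining by position means the first item of every input becomes row 1, the second becomes row 2, and so on. The standalone Python sketch below illustrates that idea with a hypothetical combine_by_position helper. Note the assumption that inputs of unequal length are truncated to the shortest; check the Merge node's options for how unpaired items are handled in your n8n version.

```python
def combine_by_position(*streams):
    """Merge the i-th item of every stream into a single row dict."""
    merged = []
    for items in zip(*streams):  # zip stops at the shortest stream (assumption)
        row = {}
        for item in items:
            row.update(item)
        merged.append(row)
    return merged


# Hypothetical per-column streams, as produced by splitting each column.
col1 = [{"column1": "a"}, {"column1": "b"}]
col2 = [{"column2": "x"}, {"column2": "y"}]
table = combine_by_position(col1, col2)
print(table)
```

If the AI returns columns with different numbers of values, this is the step where rows can silently go missing, so it is worth inspecting the merged item count.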
Step 11: Rename Columns
- Set node "Rename Columns" assigns the AI-generated names to each column
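One way to picture the rename is a key-for-key lookup from the generic column keys to the generated names. This standalone Python sketch assumes a header mapping of the form { column1: name1, … }; the rename_columns helper and the sample data are hypothetical, and the Set node may wire the values differently.

```python
def rename_columns(rows, header):
    """Replace generic keys with generated names; unknown keys are kept as-is."""
    return [{header.get(key, key): value for key, value in row.items()} for row in rows]


# Hypothetical merged rows and generated header mapping.
rows = [{"column1": "Slack alerts", "column2": "Marketing"}]
header = {"column1": "Use Case", "column2": "Department"}
labeled = rename_columns(rows, header)
print(labeled)
```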
Step 12: Final Output
- Merge node "Append Column Names" combines data and header row
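Appending simply concatenates the two inputs into one list of items. In this standalone Python sketch the header row is placed first, which is an assumption; the actual order depends on how the Merge node's inputs are wired. The append_header helper and the data are hypothetical.

```python
def append_header(header, rows):
    """Return a new list with the header row followed by the data rows."""
    return [header] + list(rows)


# Hypothetical header row and data rows sharing the same keys.
header = {"column1": "Use Case", "column2": "Department"}
rows = [
    {"column1": "Slack alerts", "column2": "Marketing"},
    {"column1": "CRM sync", "column2": "Sales"},
]
dataset = append_header(header, rows)
for row in dataset:
    print(row)
```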
Done! You now have a fully AI-driven, labeled dataset generated from a single topic, with no external services needed beyond OpenAI. To export it, add a Google Sheets or HTTP Request node.
Need Help or Want to Customize This?
robert@ynteractive.com
LinkedIn
Nodes Used
AI Agent, OpenAI Chat Model, Structured Output Parser, Think Tool
Import
Download workflow.json and import into n8n:
Workflow menu → Import from File