⚒️ Evaluation metric example: categorization

1,544 views · ⚒️ Engineering

Description

AI evaluation in n8n

This is a template for n8n’s evaluation feature.

Evaluation is a technique for building confidence that your AI workflow performs reliably: you run a test dataset of varied inputs through the workflow and check the results.

By calculating a metric (score) for each input, you can see where the workflow is performing well and where it isn’t.
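As a rough illustration of the idea (not n8n's internal implementation), the sketch below runs a small dataset through a stand-in workflow and records one score per input. The names `run_evaluation`, `toy_workflow`, and `exact_match`, and the sample tickets, are hypothetical and chosen only for this example.

```python
from statistics import mean


def exact_match(actual, expected):
    """Score 1.0 when the output equals the expected answer, else 0.0."""
    return 1.0 if actual == expected else 0.0


def run_evaluation(workflow, dataset, metric):
    """Run every dataset row through the workflow and score each output."""
    results = []
    for row in dataset:
        actual = workflow(row["input"])
        results.append({
            "input": row["input"],
            "actual": actual,
            "score": metric(actual, row["expected"]),
        })
    return results


def toy_workflow(text):
    """Hypothetical stand-in for the AI workflow: a naive keyword rule."""
    return "billing" if "invoice" in text.lower() else "technical"


# Hypothetical test dataset: inputs paired with the expected answers.
dataset = [
    {"input": "My invoice is wrong", "expected": "billing"},
    {"input": "App crashes on login", "expected": "technical"},
    {"input": "I was charged twice", "expected": "billing"},
]

results = run_evaluation(toy_workflow, dataset, exact_match)
average = mean(r["score"] for r in results)
failures = [r["input"] for r in results if r["score"] == 0.0]
```

Here `average` summarizes overall performance, while `failures` lists the inputs the workflow got wrong, which is where per-input scores are most useful.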

How it works

This template shows how to calculate a workflow evaluation metric: whether a category matches the expected one.

The workflow takes support tickets and generates a category and a priority for each one, which are then compared with the correct answers in the dataset.
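The comparison step could look something like the following sketch. It assumes the workflow output and the dataset row are both dictionaries with `category` and `priority` keys, and that labels should match regardless of case and surrounding whitespace; those field names, the normalization rule, and the sample values are assumptions for illustration, not details taken from the template itself.

```python
def normalize(label):
    """Trim whitespace and lowercase so 'Billing ' matches 'billing'."""
    return label.strip().lower()


def field_match(predicted, expected, field):
    """Return 1 if the given field matches the expected value, else 0."""
    return int(normalize(predicted[field]) == normalize(expected[field]))


# Hypothetical workflow output for one support ticket.
predicted = {"category": "Billing ", "priority": "high"}

# Hypothetical correct answers stored in the dataset for that ticket.
expected = {"category": "billing", "priority": "low"}

metrics = {
    "category_match": field_match(predicted, expected, "category"),
    "priority_match": field_match(predicted, expected, "priority"),
}
```

A 0/1 score per ticket keeps the metric easy to average across the dataset: the mean of `category_match` is simply the fraction of tickets categorized correctly.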

🔗 Nodes Used

Webhook, AI Agent, OpenAI Chat Model, Structured Output Parser, Evaluation Trigger, Evaluation

📥 Import

Download workflow.json and import it into n8n: Workflow menu → Import from File

📖 Importing guide · 🔑 Credential setup