Evaluate hybrid search for legal question-answering using Qdrant & BM25/mxbai
2,075 views · AI RAG & Knowledge Retrieval
Description
Evaluate Hybrid Search on Legal Dataset
This is the second part of “Hybrid Search with Qdrant & n8n, Legal AI.” The first part, “Indexing”, covers preparing and uploading the dataset to Qdrant.
Overview
This pipeline demonstrates how to perform Hybrid Search on a Qdrant collection using questions and text chunks (containing answers) from the LegalQAEval dataset (isaacus).
On a small subset of questions, it shows:
- How to set up hybrid retrieval in Qdrant with:
  - BM25-based keyword retrieval;
  - mxbai-embed-large-v1 semantic retrieval;
  - Reciprocal Rank Fusion (RRF), a simple zero-shot fusion of the two searches;
- How to run a basic evaluation:
  - Calculate hits@1: the percentage of evaluation questions where the top-1 retrieved text chunk contains the correct answer.
After running this pipeline, you will have a quality estimate of a simple hybrid retrieval setup.
From there, you can reuse Qdrant's Query Points node to build a legal RAG chatbot.
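RRF and hits@1 can both be illustrated offline with a minimal Python sketch. This is not the workflow's implementation: in the pipeline Qdrant performs the fusion server-side. The function names, the toy chunk IDs, and the constant k=60 (the value from the original RRF paper, not necessarily the one Qdrant uses) are assumptions made for illustration, and "contains the correct answer" is simplified to an ID match.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse ranked lists of chunk IDs with Reciprocal Rank Fusion.

    Each ID scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an illustrative assumption.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda chunk_id: (-scores[chunk_id], chunk_id))


def hits_at_1(top_ids, relevant_ids):
    """Fraction of questions whose top-1 chunk is one that holds the answer."""
    if not top_ids:
        return 0.0
    hits = sum(1 for top, relevant in zip(top_ids, relevant_ids) if top in relevant)
    return hits / len(top_ids)


# Toy data: hypothetical chunk IDs, not taken from LegalQAEval.
bm25_results = [["c3", "c1", "c7"], ["c9", "c2", "c4"]]
dense_results = [["c1", "c5", "c3"], ["c2", "c8", "c9"]]
relevant = [{"c1"}, {"c4"}]  # chunks that contain the answer, per question

fused = [rrf_fuse([b, d]) for b, d in zip(bm25_results, dense_results)]
top1 = [ranking[0] for ranking in fused]
print(f"hits@1 = {hits_at_1(top1, relevant):.0%}")  # prints "hits@1 = 50%"
```

In the first toy question the chunk ranked second by BM25 and first by the dense search wins after fusion, which is the behavior that makes RRF useful when the two retrievers disagree.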
Embedding Inference
- By default, this pipeline uses Qdrant Cloud Inference to convert questions to embeddings.
- You can also use an external embedding provider (e.g. OpenAI).
- In that case, minimally update the pipeline, similar to the adjustments shown in Part 1: Indexing.
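As a rough sketch of what that adjustment amounts to on the query side: with Cloud Inference the dense query is sent as text plus a model name, while with an external provider the pipeline first obtains a vector and sends that instead. The helper `dense_prefetch`, the stand-in `fake_embed`, the vector name `mxbai_large`, and the model identifier are all assumptions for illustration; the actual names must match the collection created in Part 1.

```python
def dense_prefetch(question, embed=None, using="mxbai_large",
                   model="mixedbread-ai/mxbai-embed-large-v1", limit=20):
    """Build the dense part of a hybrid query (hypothetical helper).

    embed=None  -> let Qdrant Cloud Inference embed the text (assumed model id).
    embed=func  -> call an external provider and send the raw vector.
    """
    if embed is None:
        query = {"text": question, "model": model}
    else:
        query = [float(x) for x in embed(question)]
    return {"query": query, "using": using, "limit": limit}


def fake_embed(text):
    """Stand-in for an external embedding call; returns a dummy vector."""
    return [0.1, 0.2, 0.3]


print(dense_prefetch("What is consideration in contract law?"))
print(dense_prefetch("What is consideration in contract law?", embed=fake_embed))
```

Whichever provider is used, the query vector must have the same dimensionality and come from the same model as the vectors stored during indexing.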
Prerequisites
- A completed Part 1 pipeline, “Hybrid Search with Qdrant & n8n, Legal AI: Indexing”, and the collection created in it;
- All the requirements of the Part 1 pipeline.
Hybrid Search
The example here is a basic hybrid query. You can extend/enhance it with:
- Reranking strategies;
- Different fusion techniques;
- Score boosting based on metadata;
- …
More details: Hybrid Queries in Qdrant.
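For orientation, a basic hybrid query against Qdrant's Query API (`POST /collections/{collection_name}/points/query`) might have a body like the one built below: two prefetches, one per retriever, fused with RRF. This is a sketch under assumptions rather than the exact request the workflow sends: the vector names `bm25` and `mxbai_large`, the model identifiers, and the limits are placeholders that must match what Part 1 created.

```python
import json


def build_hybrid_query(question, limit=1, prefetch_limit=20):
    """Build a JSON body for a hybrid (BM25 + dense) query fused with RRF."""
    return {
        "prefetch": [
            {
                # Keyword branch: text embedded server-side (assumed model id).
                "query": {"text": question, "model": "qdrant/bm25"},
                "using": "bm25",  # assumed sparse vector name
                "limit": prefetch_limit,
            },
            {
                # Semantic branch: dense embedding (assumed model id).
                "query": {"text": question,
                          "model": "mixedbread-ai/mxbai-embed-large-v1"},
                "using": "mxbai_large",  # assumed dense vector name
                "limit": prefetch_limit,
            },
        ],
        # Fuse the two candidate lists with Reciprocal Rank Fusion.
        "query": {"fusion": "rrf"},
        # limit=1 is enough for hits@1; raise it for a RAG context window.
        "limit": limit,
        "with_payload": True,
    }


body = build_hybrid_query("What is consideration in contract law?")
print(json.dumps(body, indent=2))
```

The extensions listed above would change this body rather than the surrounding pipeline, for example by swapping the fusion method or nesting further prefetch stages for reranking.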
P.S.
- To ask questions about retrieval in Qdrant, join the Qdrant Discord.
- Star the Qdrant n8n community node repo <3
Nodes Used
HTTP Request, Filter
Import
Download workflow.json and import into n8n:
Workflow menu → Import from File