Medium · AI Engineering · Theory

RAG System Retrieval Failure — Diagnosis and Fix

A production RAG system serving a legal research platform has a recall@5 of 34%, meaning only 34% of relevant documents appear in the top 5 retrieved results. The system uses:

OpenAI text-embedding-3-small for both indexing and queries
Pinecone with cosine similarity
Top-5 retrieval
No reranking
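
For reference, the recall@5 figure quoted above can be computed along the following lines. This is a minimal sketch, assuming each evaluation query comes with a ranked list of retrieved document IDs and a hand-labelled set of relevant IDs; the function names and the document IDs are hypothetical and not part of the system described here.

```python
def recall_at_k(retrieved, relevant, k=5):
    """Fraction of the relevant document IDs that appear in the top-k retrieved IDs."""
    if not relevant:
        raise ValueError("relevant set must not be empty")
    top_k = set(retrieved[:k])
    return len(top_k & set(relevant)) / len(relevant)


def mean_recall_at_k(runs, k=5):
    """Average recall@k over (retrieved, relevant) pairs, one pair per query."""
    return sum(recall_at_k(retrieved, relevant, k) for retrieved, relevant in runs) / len(runs)


# Hypothetical labelled queries: ranked retrieval output plus the known-relevant IDs.
runs = [
    (["d1", "d7", "d3", "d9", "d4", "d2"], {"d3", "d2"}),  # d2 is ranked 6th, so it is missed
    (["d5", "d6", "d8", "d1", "d0"], {"d5"}),
]

print(mean_recall_at_k(runs, k=5))
```

Under this definition, a query with more than k relevant documents can never reach a recall of 1.0, which is worth keeping in mind when reading an aggregate number.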

Users frequently report that the system misses relevant case law even when they know it exists in the database. Queries tend to be in plain language ("what are the rules around non-compete agreements in California") while documents use formal legal terminology ("enforceability of restrictive covenants under California Business and Professions Code").
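
The size of the vocabulary gap between the two example phrasings can be made concrete with a quick check. The sketch below measures only word-level (lexical) overlap using a naive whitespace tokenizer, which is an assumption made for illustration; it says nothing about how the embedding model in this system scores the pair.

```python
def tokens(text):
    """Lowercase and split on whitespace; a deliberately naive tokenizer."""
    return set(text.lower().split())


query = "what are the rules around non-compete agreements in California"
document = (
    "enforceability of restrictive covenants under "
    "California Business and Professions Code"
)

shared = tokens(query) & tokens(document)
# Jaccard similarity: shared words divided by all distinct words in either text.
jaccard = len(shared) / len(tokens(query) | tokens(document))

print(sorted(shared))
print(round(jaccard, 3))
```

The only word the two texts have in common is the jurisdiction name; the legal concept itself is expressed with entirely different terms on each side.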

1. Identify the most likely cause of the low recall. Be specific about what is failing in the retrieval pipeline.
2. Propose a two-stage retrieval pipeline that would improve recall. Describe each stage, what model or algorithm is used, and why it addresses the specific failure mode.
3. What is the recall/latency tradeoff of your proposed solution, and how would you measure whether it actually improved recall@5?
