Medium · ML Engineering · Theory

Walk me through debugging a session with incorrect LLM outputs

Your production LLM application is returning incorrect outputs for a subset of users: about 3% of requests produce wrong answers, and users are complaining. You need to diagnose and fix the root cause. Walk through your debugging process step by step. Cover: how you reproduce the failures reliably (finding the pattern in logs), how you isolate whether the issue is a retrieval failure, a generation failure, a prompt issue, or a data quality problem, how you use prompt logging and tracing tools to inspect the exact context being sent to the model, how you build a regression test from the failures, and how you verify the fix works before deploying.
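As a starting point for the first two parts (finding the pattern in logs, then separating retrieval failures from generation or prompt failures), a minimal Python sketch might look like the following. It is an illustration under stated assumptions, not a reference answer: the record fields `failed`, `retrieved_chunks`, and `expected_answer`, and both function names, are hypothetical and would need to match whatever your own logging or tracing setup actually stores.

```python
from collections import defaultdict


def failure_rate_by(records, key):
    """Group logged requests by one attribute and report failures per group.

    Assumes each record is a dict with a boolean "failed" field
    (hypothetical schema). Returns {value: (failures, total, rate)}.
    """
    counts = defaultdict(lambda: [0, 0])  # value -> [failures, total]
    for record in records:
        bucket = counts[record[key]]
        bucket[1] += 1
        if record["failed"]:
            bucket[0] += 1
    return {value: (f, t, f / t) for value, (f, t) in counts.items()}


def triage(record):
    """Coarse first-pass label for one failed request.

    Assumes the trace stored the retrieved chunks and that a known-good
    answer string is available for comparison (both hypothetical fields).
    A substring check is crude; it only narrows where to look next.
    """
    chunks = record["retrieved_chunks"]
    if not chunks:
        return "retrieval_empty"
    context = " ".join(chunks).lower()
    if record["expected_answer"].lower() not in context:
        return "retrieval_miss"
    # The right evidence reached the model, so look at the prompt
    # template, context ordering, truncation, or the generation itself.
    return "generation_or_prompt"
```

Running `failure_rate_by` over several attributes (locale, model version, document source, prompt length bucket) is one way to see whether the 3% is concentrated in a segment. Each triaged failure, with its exact logged context, can then be saved as a case in a regression suite and re-run against the candidate fix before deployment.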
