Your production LLM application is returning incorrect outputs for a subset of users: about 3% of requests produce wrong answers, and users are complaining. You need to diagnose and fix the root cause. Walk through your debugging process step by step. Cover: how you reproduce the failures reliably (finding the pattern in logs); how you isolate whether the issue is a retrieval failure, a generation failure, a prompt issue, or a data quality problem; how you use prompt logging and tracing tools to inspect the exact context being sent to the model; how you build a regression test from the failures; and how you verify that the fix works before deploying.
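
To make two of the steps above concrete (finding the pattern in logs, and isolating which stage failed), here is a minimal Python sketch. It is illustrative only and rests on assumptions not stated in the question: that each logged request is a dict with a boolean `failed` field plus metadata such as `locale`, and that a handful of failures have been hand-labelled with a `gold_doc_id` and an `expected_answer`. Every field name, function name, and threshold here is hypothetical, and the substring checks are a crude stand-in for a real grader.

```python
from collections import defaultdict


def failure_rate_by_segment(records, key):
    """Return {segment_value: (failures, total, rate)} for one log field."""
    counts = defaultdict(lambda: [0, 0])  # segment -> [failures, total]
    for rec in records:
        bucket = counts[rec.get(key, "<missing>")]
        bucket[1] += 1
        if rec["failed"]:
            bucket[0] += 1
    return {seg: (f, t, f / t) for seg, (f, t) in counts.items()}


def suspicious_segments(records, keys, min_total=20, lift=3.0):
    """List segments whose failure rate is at least `lift` times the overall rate.

    `min_total` and `lift` are arbitrary illustrative thresholds.
    """
    if not records:
        return []
    overall = sum(r["failed"] for r in records) / len(records)
    if overall == 0:
        return []
    hits = []
    for key in keys:
        for seg, (f, t, rate) in failure_rate_by_segment(records, key).items():
            if t >= min_total and rate >= lift * overall:
                hits.append((key, seg, f, t, rate))
    # Worst segments first.
    return sorted(hits, key=lambda h: h[4], reverse=True)


def triage(trace):
    """Guess which stage failed for one labelled trace (hypothetical schema)."""
    gold = trace["gold_doc_id"]
    retrieved = {d["id"]: d["text"] for d in trace["retrieved"]}
    if gold not in retrieved:
        return "retrieval"      # the right document never reached the model
    gold_text = retrieved[gold]
    if trace["expected_answer"].lower() not in gold_text.lower():
        return "data_quality"   # the document itself lacks the right answer
    if gold_text not in trace["final_prompt"]:
        return "prompt"         # template or truncation dropped the context
    return "generation"         # the model had the answer in context
```

Running `suspicious_segments(records, ["locale", "model_version"])` over a day of logs would rank the segments worth reproducing first, and the labelled traces that `triage` classifies can be kept as fixed inputs for a regression suite that is rerun against the candidate fix before deploying.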