Your production LLM pipeline needs consistent, repeatable outputs, but LLMs are stochastic by nature. Walk through your strategy for making outputs as deterministic as possible. Cover: temperature and top-p settings and their tradeoffs; when to use structured output formats (JSON mode, function calling, grammar-constrained decoding); how to use output validation and retry logic; caching strategies for identical prompts; and how to handle cases where non-determinism is actually desirable. Finally, what do you do when the model returns JSON that is syntactically valid but whose schema varies between calls (for example, different key names, nesting, or value types)?
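
To make the last question concrete, here is a minimal Python sketch of how validation, retry, and caching might fit together. It is one possible arrangement under stated assumptions, not a definitive implementation: `EXPECTED_FIELDS`, `conforms`, `generate_validated`, and the `call_model` callable are all hypothetical names introduced for illustration, and the two-field schema is invented. The point it shows is that a reply can parse as JSON and still be rejected because its keys or types differ from the expected schema.

```python
import hashlib
import json

# Hypothetical target schema (an assumption): field name -> accepted type(s).
EXPECTED_FIELDS = {"label": str, "confidence": (int, float)}


def conforms(payload):
    """Return True only for a dict with exactly the expected keys and types."""
    if not isinstance(payload, dict) or set(payload) != set(EXPECTED_FIELDS):
        return False
    return all(isinstance(payload[k], t) for k, t in EXPECTED_FIELDS.items())


def generate_validated(prompt, call_model, cache, max_attempts=3):
    """Call the model until its reply parses AND matches the schema.

    call_model is a hypothetical callable (prompt -> raw text); cache is any
    dict-like store keyed by a hash of the prompt.
    """
    key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
    if key in cache:
        # Identical prompt seen before: reuse the validated result.
        return cache[key]

    for _ in range(max_attempts):
        raw = call_model(prompt)
        try:
            payload = json.loads(raw)
        except json.JSONDecodeError:
            continue  # not JSON at all: retry
        if conforms(payload):
            cache[key] = payload  # cache only validated output
            return payload
        # Valid JSON with the wrong keys or types falls through and retries.

    raise ValueError(f"no schema-conforming reply after {max_attempts} attempts")
```

With this arrangement, a reply such as `{"category": "spam", "certainty": 0.9}` is treated the same way as malformed text: it triggers a retry rather than flowing downstream. Because only validated results are cached, a repeated identical prompt returns the same payload regardless of sampling settings.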