You are building an agentic system that executes multi-step tasks. Halfway through a 10-step task the LLM produces an invalid response — wrong format, hallucinated tool call, or refused to continue. Design your fallback architecture. Cover: how to detect failure modes (validation, parsing errors, semantic validation), retry strategies with different prompts or temperatures, graceful degradation to simpler approaches, when to escalate to a human, how to preserve partial work so you can resume rather than restart, and how to prevent cascading failures in multi-agent systems.