Compare few-shot and zero-shot prompting strategies for production LLM applications. When does zero-shot outperform few-shot? When does adding examples hurt rather than help? How do you select good few-shot examples (diversity, ordering, format consistency)? What is the token cost of few-shot prompting, and how do you manage it at scale? Give three concrete examples: a task where zero-shot is clearly better, one where few-shot is essential, and one where chain-of-thought reasoning beats both.