Jailbreak Taxonomy

A jailbreak is an input designed to make an AI model generate content or take actions that its safety training was intended to prevent. Understanding the major attack categories helps you evaluate attack sophistication and defense

