By manipulating conversational context over multiple turns, the jailbreak bypasses the safety measures designed to prevent GPT-5 from generating harmful content.
So much for “safe completions” — turns out if you can’t trip the guardrails in one go, you can just walk the model there slowly.