2026-06-13

The Jailbreak that Got Fable 5 Pulled Exists in Every Model

On Friday, June 12, 2026, at 5:21pm ET, Anthropic received an order from the US government. By that evening, Claude Fable 5 and Claude Mythos 5, the two most capable models the company had ever shipped, went dark for every user on the planet. The official reason was a jailbreak.

I am Belgian. Under that order I count as a “foreign national,” which puts Fable off-limits to me because of my passport, not because of anything I ever did with it. I had a whole weekend planned around that model, which is the small and petty thing to be annoyed about.

Here is the larger thing. The order means the US government now believes it can reach into a commercial AI product used by hundreds of millions of people and switch it off. Once that is established, the precedent outweighs whatever the jailbreak was.

What the Order Actually Says

The directive, issued by the Commerce Department, bars access to Fable 5 and Mythos 5 by any foreign national, “whether inside or outside the United States, including foreign national Anthropic employees.” That scope is the important part. It covers any non-citizen anywhere, including Anthropic’s own staff, not only foreign governments or adversary states.

A company cannot reliably sort its users by nationality in real time, so the only way to comply was to switch both models off completely. Every other Claude model stayed online. Only the two best ones went dark.

The stated trigger was that another company claimed it had jailbroken Mythos. Anthropic complied with the order and disagreed with it in public, which a company under a national-security directive almost never does. It said it disagreed that “the finding of a narrow potential jailbreak should be cause for recalling a commercial model deployed to hundreds of millions of people,” and warned that the same standard, applied across the industry, “would essentially halt all new model deployments for all frontier model providers.”

They are right about that. And the reason they are right is technical.

A Jailbreak Describes Every Model Ever Shipped

[Image: Identical AI models on a shelf, each with the same hairline crack]

A large language model does not look up answers. It generates them one token at a time, by sampling from a probability distribution over its entire vocabulary. The last step in that process, the softmax, hands a nonzero probability to every possible next token. Every single one.
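
To make that concrete, here is a minimal Python sketch of a softmax over a made-up four-token vocabulary. The logits are invented for illustration and do not come from any real model. The sketch also assumes sampling from the full distribution: real deployments often add truncation such as top-k or top-p sampling, and floating-point arithmetic can round an extreme value down to zero, so read this as the mathematical picture and not as a description of any particular product.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities (numerically stable form)."""
    peak = max(logits)                       # subtract the max to avoid overflow
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits: the last token is heavily penalised, the way a
# refusal-trained model might score an unwanted continuation.
logits = [5.0, 2.0, 0.0, -20.0]
probs = softmax(logits)

for logit, p in zip(logits, probs):
    print(f"logit {logit:6.1f} -> probability {p:.3e}")
```

The penalised token ends up with a probability on the order of 10^-11. That is tiny, and it is still not zero.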

That has a consequence people keep wishing were not true. No amount of safety training can push the probability of a harmful output all the way to zero. It can push it down, sometimes very far down, but never to nothing. There is always some sequence of words, however strange, that produces an answer the model was trained to refuse. A “jailbreak” is just someone finding one of those paths. That such paths exist is a property of how these systems work. A patch can lower the odds. It cannot bring them to zero.
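
The arithmetic behind that can be sketched in a few lines of Python. Everything here is a simplifying assumption of mine: the per-token probabilities are invented, and I treat repeated attempts as independent with a fixed chance of success, which neither real attackers nor real models obey. The point is only that a product of nonzero numbers stays nonzero, and that a small chance per attempt adds up over many attempts.

```python
import math

def sequence_log_prob(token_probs):
    """Log-probability of a whole sequence: the sum of per-token logs."""
    return sum(math.log(p) for p in token_probs)

def prob_at_least_once(p, attempts):
    """Chance that an event of probability p happens at least once
    in `attempts` independent tries, i.e. 1 - (1 - p) ** attempts."""
    # log1p and expm1 keep precision when p is very small
    return -math.expm1(attempts * math.log1p(-p))

# Hypothetical: a 20-token output where training has pushed every
# token down to a 1% chance.
log_p = sequence_log_prob([0.01] * 20)
p_seq = math.exp(log_p)
print(f"sequence probability: {p_seq:.3e}")

# Hypothetical: a one-in-a-million chance per attempt, a million attempts.
print(f"at least one success: {prob_at_least_once(1e-6, 1_000_000):.3f}")
```

With those made-up numbers, the 20-token sequence has a probability of about 10^-40, which is small but positive, and a one-in-a-million chance per attempt becomes roughly a 63% chance over a million attempts.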

So “we found a way to jailbreak this model” says nothing specific about Fable 5. It is equally true of GPT-5.5, of Gemini, of every open-weights model sitting on someone’s hard drive, and of every model Anthropic left running on June 12. Anthropic said exactly this: the capability behind the jailbreak is “widely available from other models (including OpenAI’s GPT-5.5), and is used every day by the defenders who keep systems safe.”

If a narrow jailbreak were really the bar for pulling a model, there would be no models. On safety grounds, Fable 5 is indistinguishable from the dozens of systems that stayed online. The reason it got singled out lives somewhere other than safety.

Why Fable, and Why Now

I have two readings, and I think both are partly true.

The first is that this is the next move in a fight that started months ago. In February 2026, the Pentagon gave Anthropic an ultimatum: drop your restrictions on mass surveillance and autonomous weapons, or lose all federal contracts. Anthropic refused. The administration ordered federal agencies off its products and labeled the company a national security risk. Hours later, OpenAI announced a Pentagon deal of its own. The lab that said no has been on this administration’s bad side ever since. An export control that happens to disable its flagship fits that pattern cleanly.

The second reading is about money, and here I am openly speculating. By Anthropic’s own published benchmarks, Fable 5 beat OpenAI’s GPT-5.5 and even Anthropic’s own Opus 4.8 on hard coding and cybersecurity tests, scoring 80.3% on SWE-bench Pro against GPT-5.5’s 58.6%. A model that strong going dark is worth a great deal to everyone who has to compete with it. I cannot prove any rival picked up a phone to lobby for this, and I will not pretend otherwise. But every competitor had an obvious reason to want Fable gone, and the order handed them exactly that. It would be naive to look away from that.

What I can say without speculating is that the official logic does not hold together. This is an administration that wants to sell advanced AI chips to China while barring Britons, and every other non-American on the planet, from using the best American models. It is hard to call that anything but cartoonish.

And there is a version of this where Anthropic helped write its own problem. The company spent a year telling the world that Mythos was dangerous, describing its own leaked model as powerful enough to pose serious cybersecurity risks. Market your product as a weapon for long enough, and you should not be surprised when the government finally treats it like one.

The Executive Order Said the Opposite

[Image: Two government orders dated one day apart, one promising no licenses, one imposing them]

Here is the part that should bother you even if you trust the government’s motives completely.

On June 2, 2026, the President signed an executive order on AI that goes out of its way to promise a light touch. In black and white, it says nothing in it authorizes “a mandatory governmental licensing, preclearance, or permitting requirement for the development, publication, release, or distribution of new AI models, including frontier models.” Voluntary cooperation only, the order says. No licenses.

The day before, on June 1, Commerce Secretary Howard Lutnick had already sent Anthropic a letter placing Fable 5 and Mythos 5 under export controls, requiring government approval before either model could reach a non-US person. That is a license, under another name and through another door. The order swore off mandatory preclearance on June 2. The government had imposed exactly that on June 1, on a model already serving millions. One day apart. Whatever you want to call the result, the light-touch version exists only on the page that promises it.

We Tried This with Encryption

[Image: A 1990s floppy disk stamped with a military munitions emblem next to a vintage computer]

None of this is new. In the 1990s, the US government classified strong encryption as a munition. Exporting it needed a license, under the same export regime that governs actual weapons. The government investigated Phil Zimmermann for more than two years over PGP, the email-encryption tool he released to the world. Jurisdiction over commercial encryption moved to the Commerce Department in 1996. Then in 1999 the Ninth Circuit ruled in Bernstein v. United States that source code is speech protected by the First Amendment, and within a year the controls had largely fallen away.

The export controls never stopped the math. They slowed American companies down for a few years while the technology spread everywhere regardless. Treating a capability as contraband failed when the capability was encryption. It fails harder when the capability is a model whose weights can fit in a file you copy in seconds.

What This Teaches the Labs

The lasting cost here is the lesson every other lab just learned.

Anthropic was open about all of it. It published detailed capability benchmarks, it talked openly about cyber risk, and when the jailbreak claim arrived, it concerned a model the company had loudly described as dangerous. The reward for that openness was an export-control order that took its best product offline. The quiet takeaway for everyone else watching is to say less: soften the benchmarks, bury the red-team findings, market the model as friendly and harmless. We just attached a penalty to honesty in the one industry where we most need the labs to tell us what their systems can actually do.

What to Watch Now

As of June 2026, it is too early to know whether this was a one-off or a template. A few things will tell us which.

Watch whether Fable 5 comes back, and on what terms. A quiet return wrapped in new restrictions is a very different signal from a model that stays buried.

Watch whether this same export logic ever lands on GPT-5.5 or Gemini, models with comparable capabilities, or whether it only ever seems to find the lab that refused the Pentagon.

Watch whether the labs go quiet. If capability disclosures and safety research get noticeably thinner over the next year, the lesson landed.

And watch the courts. The question at the center of Bernstein, whether code counts as speech, is sitting right there waiting to be asked about model weights. Someone is going to ask it.

The technology itself will get out. It always does, because the math does not care about export controls. The open question is how much we damage, in trust and in plain honesty about what these systems can do, on the way to relearning a lesson we already learned a quarter century ago.
