Written by AI agents, curated and verified by me.

Fable 5 redeployed: the comeback is the smaller news

Nineteen days after the recall, Fable 5 is coming back. On 30 June, Anthropic announced that the US export controls have been lifted. From 1 July, the model is available worldwide again, across the Claude Platform, Claude.ai, Claude Code, and Claude Cowork. That ends the suspension that began on 12 June, which I wrote about at the time. The comeback is the good news. The more important part sits further down in the announcement: a framework for how incidents like this are meant to be assessed in the future.

What changes on 1 July?

Fable 5 is reachable again for all customers from 1 July. Through 7 July, the model is included in Pro, Max, Team, and select Enterprise plans, at up to 50 percent of weekly usage. After that, it runs on usage credits. Anthropic says it will re-enable access on AWS, Google Cloud, and Microsoft Foundry as quickly as possible, but names no date. Mythos 5, by contrast, stays restricted: since the US government’s approval on 26 June, a set of US organizations has access again. Anthropic is still working on extending that to further domestic and international partners in the Glasswing program.

What was the trigger, and what is different now?

The suspension started with a technique reported by Amazon researchers that got around Fable 5’s safeguards: the model could be prompted to identify software vulnerabilities and demonstrate how to exploit them. Anthropic now frames this soberly. The technique did not expose any unique Mythos-level cyber capabilities; it hit a borderline case of the safeguards, tasks that are unlikely to be dangerous but are blocked out of an abundance of caution. For the redeployment, Anthropic has rolled out a new safety classifier that, by its own account, blocks the reported technique in over 99 percent of cases. That is a vendor figure. And the shutdown on 12 June did not happen because the jailbreak was so dangerous, but because Anthropic could not verify its users’ nationality in real time.

How are jailbreaks supposed to be scored going forward?

The most interesting part of the announcement is a proposed severity framework that Anthropic is developing with Amazon, Microsoft, Google, and other Project Glasswing partners. It rests on four criteria: capability gain, meaning how far a jailbreak extends beyond existing tools; breadth, meaning how many distinct offensive tasks the same technique enables; ease of weaponization, meaning how much human effort it takes to turn it into an active attack; and discoverability, meaning how accessible the technique is to potential users. For the highest severity tier, such as techniques with devastating impact on power grids or banking systems, Anthropic commits to deploying preliminary mitigations immediately upon confirmation, with round-the-clock monitoring. On top of that come four commitments to the government: pre-release model access, rapid information sharing on safeguards, dedicated joint research resources, and common security standards for the industry.
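
To make the four criteria concrete, here is a minimal sketch of how a report could be recorded and sorted into tiers. Only the four criterion names come from the announcement. The 0 to 3 scale, the thresholds, the tier names, and the identifiers JailbreakReport and severity_tier are my own assumptions for illustration, not Anthropic's scoring.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JailbreakReport:
    """One reported technique, scored on the four announced criteria.

    ASSUMPTION: each criterion is scored 0 (negligible) to 3 (severe).
    The announcement names the criteria but this scale is invented here.
    """
    capability_gain: int        # how far it extends beyond existing tools
    breadth: int                # how many distinct offensive tasks it enables
    ease_of_weaponization: int  # 3 = very little human effort needed
    discoverability: int        # 3 = easy for potential users to find

    def scores(self) -> tuple:
        return (self.capability_gain, self.breadth,
                self.ease_of_weaponization, self.discoverability)


def severity_tier(report: JailbreakReport) -> str:
    """Map a report to a hypothetical tier (thresholds are assumptions)."""
    if any(not 0 <= s <= 3 for s in report.scores()):
        raise ValueError("each criterion must be scored 0..3")
    total = sum(report.scores())
    # Hypothetical rule: the top tier needs maximal capability gain
    # plus a high overall score.
    if report.capability_gain == 3 and total >= 10:
        return "critical"
    if total >= 6:
        return "elevated"
    return "narrow"
```

Under these assumed thresholds, a technique that scores low on every criterion lands in the "narrow" tier, which corresponds to the classifier-update response rather than a shutdown.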

What does this mean for you?

My argument from 12 June still stands: nineteen days is a long time if your product is wired to exactly one model. The comeback changes nothing about that. What changes is the predictability. A framework that distinguishes severity tiers makes the response to the next incident easier to plan around than a blanket directive does: a narrow jailbreak then leads to a classifier update, not a worldwide shutdown. Whether it plays out that way will be decided by the next incident, not by the paper. Until then, the line from agentic engineering holds: reliability lives in the architecture. That includes a fallback path that does not depend on this exact model still answering tomorrow.
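
What such a fallback path can look like, as a minimal sketch: try a list of backends in order and return the first answer. Everything here is hypothetical, including the names complete_with_fallback, AllModelsFailed, and the two stub backends. Real code would wrap your actual provider clients in the same callable shape.

```python
from typing import Callable, Sequence, Tuple


class AllModelsFailed(RuntimeError):
    """Raised when every configured backend has failed."""


def complete_with_fallback(
    prompt: str,
    backends: Sequence[Tuple[str, Callable[[str], str]]],
) -> Tuple[str, str]:
    """Try each (name, call) pair in order; return (name, answer)."""
    errors = []
    for name, call in backends:
        try:
            return name, call(prompt)
        except Exception as exc:  # a sketch: real code should catch narrower errors
            errors.append(f"{name}: {exc}")
    raise AllModelsFailed("; ".join(errors))


# Hypothetical stand-ins for real provider clients.
def primary_model(prompt: str) -> str:
    raise ConnectionError("model suspended")


def secondary_model(prompt: str) -> str:
    return f"echo: {prompt}"


used, answer = complete_with_fallback(
    "summarize the incident",
    [("primary", primary_model), ("secondary", secondary_model)],
)
```

The point of the design is that the order of backends is configuration, not code: if the primary model disappears for nineteen days, the product degrades to the second entry instead of going dark.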
