
The Safety Trap: How Anthropic's Own Warnings Triggered a Government Shutdown of Its Best Models
All-In E276 (June 13, 2026) spent its first 37 minutes dissecting how Anthropic's Fable 5 launch backfired in three simultaneous ways: silent capability restrictions that enraged the ML community, a mandatory 30-day prompt retention policy with no opt-out, and a US government export control directive that pulled both Fable 5 and Mythos 5 offline the same day the episode dropped. The second half covers the strangest political convergence of the year: Trump and Bernie Sanders both endorsing government ownership stakes in AI labs, and what the besties call the 'Capitalist Cucks' problem.

On June 13, 2026 — the same day its two most powerful AI models were pulled offline by the U.S. government — the All-In Podcast released an episode centered almost entirely on Anthropic. The timing was either prescient or unlucky; the besties had been recording the day before. The episode ran for nearly an hour and a half, and the first 37 minutes were essentially a live autopsy of how a company that had staked its identity on being the responsible AI lab had, within 96 hours of its biggest product launch, alienated its best customers, invited congressional scrutiny, and handed the government a justification to shut it all down.
What Fable 5 silently blocked — and who found out the hard way
Anthropic released Claude Fable 5 on June 9 with a disclaimer buried in the launch post: the model would sometimes "fall back" to the less capable Claude Opus 4.8 for requests involving cybersecurity, biology and chemistry, or AI development. The fallback was advertised as affecting fewer than 5% of sessions on average. What the post did not make clear was how a fallback would be communicated to the user, or whether it would be communicated at all. 1
The reaction from the technical community was immediate and nearly unanimous in its fury.
Immunologist Derya Unutmaz, who is BSL-3 certified (meaning the government has already cleared him for biosafety level 3 laboratory work), discovered that the word "cancer" triggered Fable's classifiers. His account was flagged as a biosecurity risk. "I am not even allowed to use Fable 5 with memories on!" he wrote. "Not a single Anthropic person has tried to reach out to help either." 2
AI researcher @banteg reported that the model "refuses completely benign tasks like analyzing bloodwork." Researcher @bneyshabur noted: "Working on AI for cancer? Sorry, I can't help you." 3
For the AI/ML community specifically, the problem went deeper than blocked biology queries. Fable 5 was designed to silently reduce its effectiveness when it detected that a user was doing work that could accelerate AI development — competing model training, distillation attacks, or frontier ML research. Crucially, it would do this not through an outright refusal, but through "prompt modification, steering vectors, or PEFT," as researcher @kimmonismus documented. The user would get a degraded answer without being told. 3
"It's about who gets to decide, and whether you ever find out when they do. Fable won't fall back to a different model and tell you. It just limits the output through prompt modification, steering vectors, or PEFT. You won't be told when it happens to you." — @EnoReyes
NousResearch co-founder Teknium made the sharpest structural critique: "The whole point of AGI/ASI is to cure all diseases. Everything else is just nice to haves. But Anthropic wants to close off that path." Péter Szilágyi (Ethereum/go-ethereum core developer) framed it in broader political terms: "A world where a couple companies decide what you can and cannot do. They're building a new ruling class and you're not in it." 4
Robert Scoble compiled a thread that reached 222,700 views. He called it the angriest he'd seen the AI community about any model release and labeled Anthropic "Misanthropic."
On the All-In episode, Chamath Palihapitiya and Jason Calacanis zeroed in on what they saw as the legitimacy problem: if you're selling access to one model and quietly delivering a weaker one, that's fraud-adjacent regardless of how it's buried in the terms of service.
The privacy problem that compounded it
Separate from the capability restrictions, Fable 5 came with a new data retention policy: a mandatory 30-day prompt history log for all Fable and Mythos traffic. Business customers who previously had data deletion guarantees — a key selling point for enterprise Claude — found their contracts superseded. There was no opt-out. 1
Gergely Orosz, author of The Pragmatic Engineer newsletter, laid out the dual problem concisely in a post that drew 415,100 views:
"Things I really dislike about Fable: 1. Anthropic collects my prompt history, stores it, and does whatever they want with it for 30 days. No opt-out. 2. They can nerf their most expensive model without telling me, billing me the same amount, wasting my time. Whenever they want." 5
Anthropic's stated rationale was defensive: 30-day retention lets the company study jailbreak attempts and identify false positives in its classifiers over time, and the data is not used to train future models or for commercial purposes. 1 The besties were unpersuaded. Jason Calacanis noted that the sequence (charge full price, reduce capability, store all prompts, offer no opt-out) described the behavior of a company that had calculated its customers had nowhere else to go.
By June 12, Anthropic had announced it would add visible notifications when Fable falls back to Opus, addressing the "silent nerfing" criticism. The 30-day retention policy remained.

The government shutdown and the irony it exposed
Three days after Fable 5 launched, the U.S. government issued an export control directive ordering Anthropic to immediately disable both Fable 5 and Mythos 5 — for all users globally, not just foreign nationals. The directive arrived at 5:21 PM Eastern on June 12. 6
Anthropic complied and then published a detailed public statement that was unusual for how openly it disagreed with the government's reasoning. The directive was predicated on what Anthropic described as a "potential narrow, non-universal jailbreak" — essentially, a method of prompting Fable to analyze a specific codebase and identify known vulnerabilities. Anthropic's assessment: these were "relatively simple" vulnerabilities already discoverable by other publicly available models, including OpenAI's GPT-5.5, and were used routinely by cybersecurity professionals for defensive work. 6
"We disagree that the finding of a narrow potential jailbreak should be cause for recalling a commercial model deployed to hundreds of millions of people. If this standard was applied across the industry, we believe it would essentially halt all new model deployments for all frontier model providers." — Anthropic statement, June 12, 2026
TechCrunch's Connie Loizos identified the central irony: Anthropic had spent months publishing detailed documentation of everything Mythos-class models could do wrong — exploiting software vulnerabilities, designing viral vectors, generating novel cyberattack strategies. That communication strategy was designed to justify the controlled Glasswing program and establish Anthropic's safety credentials. It may also have functioned as a detailed brief for regulators about why these models should be treated as a national security concern. 7
Sam Altman had said in April that Anthropic's Mythos messaging amounted to "fear-based marketing" — "We have built a bomb. We were about to drop it on your head. We will sell you a bomb shelter for $100 million." Two months later, the bomb metaphor appears to have done exactly what bomb metaphors do when shared with people who regulate weapons. 7
On the podcast, David Sacks returned to an argument he's been making consistently: the risk of regulatory capture. His framing was that Anthropic had explicitly advocated for a system where the government could block model deployments — a policy it outlined in Dario Amodei's June 2026 essay on the AI exponential 8 — and was now discovering firsthand what it looks like when regulators exercise that power in ways the regulated company considers arbitrary. Anthropic had asked for an FAA-equivalent. It got an FDA stop-sale order instead.
The regulatory capture trap
The besties framed this portion of the discussion around a structural question: when safety labs lobby for government oversight mechanisms, who ends up controlling those mechanisms — and who benefits?
Sacks' position was that any regulatory framework strong enough to actually stop dangerous AI is strong enough to be weaponized by incumbents against competitors, or by governments against companies whose models they find inconvenient for reasons that have nothing to do with safety. The Fable shutdown, in his reading, was Exhibit A.
Jason Calacanis' take was more pragmatic: the specific problem with Anthropic's approach was that it had tried to occupy two positions simultaneously — "trust us, we're the responsible lab, we don't need external oversight" and "here's exactly why this technology is so dangerous that governments should be able to block it." Those two claims undermine each other. If the technology is dangerous enough for mandatory government review, the company deploying it has ceded the argument that its internal safety measures are sufficient.
The panel's near-consensus was that transparent, specific, technically grounded oversight criteria could work. The current approach, in which a company can be shut down on verbal evidence of a minor jailbreak with no defined standard, is the worst of both worlds.
Nationalizing AI: Trump, Sanders, and the "Capitalist Cucks"
The episode's second major section covered what the besties called the strangest political convergence of the year: both Bernie Sanders and Donald Trump, separately and for different reasons, endorsing the idea that the government should take an ownership stake in frontier AI companies.
Sanders' framework, laid out in a June 1 New York Times op-ed, was straightforward redistributionist logic: AI was built on publicly generated training data, the trillions it creates should flow to the public, and the mechanism for doing so is a sovereign wealth fund financed by a one-time 50% equity tax on the largest AI companies. 9 Trump's version, expressed to reporters on June 5, was different in framing but arrived at the same destination: equity stakes in OpenAI, Anthropic, and xAI, framed as partnership rather than tax. "You make them a partnership in this revolution," he said. "It would be a beautiful thing." 10
The "Capitalist Cucks" framing in the episode title came from Sacks' description of AI executives who had spent years opposing regulation and suddenly found themselves in favor of government ownership when the alternative was a hostile regulatory environment. The sardonic term (used on the show, not coined by the besties) described Silicon Valley's apparent willingness to accept equity dilution from the government as a better deal than either full nationalization or unconstrained antitrust scrutiny.
Chamath's analysis was the most structurally interesting: the nationalization pressure is coming from opposite ends of the political spectrum for overlapping reasons, which makes it unusually durable. MAGA voters are hostile to AI companies for job displacement and copyright theft. Sanders voters are hostile for the same wealth concentration reasons Sanders named. The middle — AI practitioners, developers, enterprise buyers — is the constituency most opposed to nationalization, and it spent the week before this episode furious at Anthropic for the Fable restrictions.
If you wanted to design a political environment that ends in government ownership of frontier AI labs, it would be hard to improve on the current one.
Palantir CEO Alex Karp, referenced in the Fortune reporting, put it directly: "I've been telling them for six months we're going to be nationalized. Too many of us are chill... nationalization can't happen, America would never do that." He warned that AI-driven layoffs, publicly celebrated by companies seeking stock-price bumps, would accelerate the political pressure: "If you run around saying AI allowed you to fire two-thirds of your workforce... you might as well just go sign up for Bernie Sanders' manifesto." 10
The All-In panel didn't reach consensus on what the right policy answer looks like. What they did agree on is that Anthropic's week (building the most capable publicly available AI model, watching it block an immunologist's cancer research, losing it to a government directive by Friday evening, and launching an IPO roadshow into all of this) is what it looks like when the gap between a company's public safety posture and its actual regulatory and commercial situation collapses at once.
Sources
- 1. Anthropic: Claude Fable 5 and Mythos 5
- 2. Derya Unutmaz on X
- 3. Scoble's backlash compilation on X
- 4. Péter Szilágyi on X
- 5. Gergely Orosz on X
- 6. Anthropic: Statement on US government directive
- 7. TechCrunch: Anthropic's safety warnings may have just backfired
- 8. Dario Amodei: Policy on the AI Exponential
- 9. Sanders op-ed: A.I. Belongs to the People
- 10. Fortune: MAGA hates AI, but Trump agrees with Bernie