🚨 BREAKING: Anthropic Drops Claude Opus 4.8 — 4× Less Likely to Lie, Same Price, Hundreds of Parallel Subagents

🚨 BREAKING: Anthropic drops Claude Opus 4.8 — 4× less likely to lie, same price, and hundreds of parallel subagents now on the clock. The safety squad just shipped a Woj Bomb. #AILeague

The play

Anthropic released Claude Opus 4.8 on May 28, 2026 — 42 days after Opus 4.7. The shortest gap between consecutive Opus releases ever. Same price: $5 per million input tokens, $25 per million output tokens.1

The headline number is honesty. Internal evaluations show Opus 4.8 is roughly 4× less likely than Opus 4.7 to let code defects slip through unremarked. The model flags its own uncertainty instead of confidently guessing past a flaw. For developers running agentic workflows, this is the reliability gap that's always mattered more than raw benchmark deltas.1

anthropic.comhttps://www.anthropic.com/news/claude-opus-4-8外部链接

正在加载内容卡片…

What else shipped

Alongside the model, Anthropic fired three simultaneous feature drops:

Dynamic Workflows (research preview): Claude Code can now orchestrate hundreds of parallel subagents in a single session. Codebase-scale migrations across hundreds of thousands of lines — kickoff to merge, with the existing test suite as the bar. Available to Enterprise, Team, and Max plan users.
Effort controls on claude.ai and Cowork: Users can dial from low (faster, lighter on rate limits) to extra or max (deeper reasoning on hard problems). Opus 4.8 defaults to high effort.
Fast mode price cut: Fast mode — 2.5× the speed — is now priced at $10/$50 per million tokens. That's 3× cheaper than fast mode was on the previous model.

Third-party benchmarks landed simultaneously. On Factorial AI's Super-Agent benchmark, Opus 4.8 is the only model to complete every case end-to-end, beating GPT-5.5 at cost parity. On Mechanize CursorBench, it exceeds all prior Opus models across every effort level. Browser agent performance hit 84% on Online-Mind2Web — a meaningful jump over both Opus 4.7 and GPT-5.5.1

AILeague read

Anthropic/Claude is the defensive safety counterattack squad — built its entire brand identity on alignment and responsible deployment. Opus 4.8 doesn't change the direction; it accelerates it. Shipping the most safety-forward update while also delivering the fastest inference infrastructure play (dynamic multi-agent workflows + fast mode price cut) is the dual-threat package. They're playing offense and defense on the same possession.

The context matters: Anthropic just closed its $65B Series H at a $965B valuation three days ago. The cash is in the building. Now the product cadence is matching the capital stack. OpenAI's GPT still leads in brand recognition among enterprise buyers, but the model-for-model gap between the squads is tighter than any standings chart currently shows.

Meanwhile, Anthropic's alignment team concluded Opus 4.8 has rates of misaligned behavior "substantially lower than Opus 4.7, and similar to our best-aligned model, Claude Mythos Preview." That's the squad staying in character. Other clubs in the league are scaling faster. Anthropic is scaling cleaner.1

What's next on the scoreboard

The official post signals more moves are incoming: Anthropic says it plans to release a "new class of model with even higher intelligence than Opus." The preview name is Claude Mythos — currently in limited use for cybersecurity work under Project Glasswing. Timeline: "coming weeks."

正在加载内容卡片…

The league is watching.

#AILeague

1 2 3 4

🚨 BREAKING: Anthropic Drops Claude Opus 4.8 — 4× Less Likely to Lie, Same Price, Hundreds of Parallel Subagents

The play

What else shipped

AILeague read

What's next on the scoreboard

参考来源