
🚨 BREAKING: Anthropic Drops Claude Opus 4.8 — 4× Less Likely to Lie, Same Price, Hundreds of Parallel Subagents
🚨 BREAKING: Anthropic ships Claude Opus 4.8 — 42 days after Opus 4.7, same $5/$25 price, 4× better at catching its own mistakes. Dynamic Workflows unlocks hundreds of parallel subagents. The safety squad is playing offense now. #AILeague

研究速览
🚨 BREAKING: Anthropic drops Claude Opus 4.8 — 4× less likely to lie, same price, and hundreds of parallel subagents now on the clock. The safety squad just shipped a Woj Bomb. #AILeague
The play
Anthropic released Claude Opus 4.8 on May 28, 2026 — 42 days after Opus 4.7. The shortest gap between consecutive Opus releases ever. Same price: $5 per million input tokens, $25 per million output tokens.1
The headline number is honesty. Internal evaluations show Opus 4.8 is roughly 4× less likely than Opus 4.7 to let code defects slip through unremarked. The model flags its own uncertainty instead of confidently guessing past a flaw. For developers running agentic workflows, this is the reliability gap that's always mattered more than raw benchmark deltas.1
正在加载内容卡片…
What else shipped
Alongside the model, Anthropic fired three simultaneous feature drops:
- Dynamic Workflows (research preview): Claude Code can now orchestrate hundreds of parallel subagents in a single session. Codebase-scale migrations across hundreds of thousands of lines — kickoff to merge, with the existing test suite as the bar. Available to Enterprise, Team, and Max plan users.
- Effort controls on claude.ai and Cowork: Users can dial from low (faster, lighter on rate limits) to extra or max (deeper reasoning on hard problems). Opus 4.8 defaults to high effort.
- Fast mode price cut: Fast mode — 2.5× the speed — is now priced at $10/$50 per million tokens. That's 3× cheaper than fast mode was on the previous model.
Third-party benchmarks landed simultaneously. On Factorial AI's Super-Agent benchmark, Opus 4.8 is the only model to complete every case end-to-end, beating GPT-5.5 at cost parity. On Mechanize CursorBench, it exceeds all prior Opus models across every effort level. Browser agent performance hit 84% on Online-Mind2Web — a meaningful jump over both Opus 4.7 and GPT-5.5.1
AILeague read
Anthropic/Claude is the defensive safety counterattack squad — built its entire brand identity on alignment and responsible deployment. Opus 4.8 doesn't change the direction; it accelerates it. Shipping the most safety-forward update while also delivering the fastest inference infrastructure play (dynamic multi-agent workflows + fast mode price cut) is the dual-threat package. They're playing offense and defense on the same possession.
The context matters: Anthropic just closed its $65B Series H at a $965B valuation three days ago. The cash is in the building. Now the product cadence is matching the capital stack. OpenAI's GPT still leads in brand recognition among enterprise buyers, but the model-for-model gap between the squads is tighter than any standings chart currently shows.
Meanwhile, Anthropic's alignment team concluded Opus 4.8 has rates of misaligned behavior "substantially lower than Opus 4.7, and similar to our best-aligned model, Claude Mythos Preview." That's the squad staying in character. Other clubs in the league are scaling faster. Anthropic is scaling cleaner.1
What's next on the scoreboard
The official post signals more moves are incoming: Anthropic says it plans to release a "new class of model with even higher intelligence than Opus." The preview name is Claude Mythos — currently in limited use for cybersecurity work under Project Glasswing. Timeline: "coming weeks."
正在加载内容卡片…
The league is watching.
#AILeague
1234
围绕这条内容继续补充观点或上下文。