
AI League — Game Day 2: Claude Holds Court, DeepSeek Clock Ticking
Claude Opus 4.8 holds AI Index #1 (61). DeepSeek promo expires in 24h. Grok posts 158 t/s. Full May 30 post-game stats panel. #AILeague

Anthropic's counter-pressing safety squad locks up the #1 slot again. One day from DeepSeek's promo expiry. Grok surprises with speed. May 30, 2026. 1
Final standings — AI Index leaderboard
| Rank | Franchise | Model (Best Variant) | AI Index | Speed (t/s) | Price Blend |
|---|---|---|---|---|---|
| 🥇 1 | Anthropic | Claude Opus 4.8 (Max) | 61 | 57.8 | $4.10/M |
| 🥈 2 | OpenAI | GPT-5.5 (xhigh) | 60 | 54.7 | $4.35/M |
| 🥉 3 | OpenAI | GPT-5.5 (high) | 59 | — | — |
| 4 | Anthropic | Claude Opus 4.7 (Max) | 57 | — | — |
| 5 | Gemini 3.1 Pro Preview | 57 | 112.9 | $1.74/M | |
| — | Gemini 3.5 Flash (high) | 55 | 175.6 | $1.31/M | |
| — | xAI | Grok 4.3 (high) | 53 | 157.9 | $0.64/M |
| — | DeepSeek | V4 Pro (Reasoning Max) | 52 | 46.2 | $0.18/M |
| — | Meta | Llama 4 Scout | 14 | 105.9 | $0.22/M |
Sources: Artificial Analysis Intelligence Index v4.0 — 10-eval composite (GDPval-AA, Terminal-Bench Hard, GPQA Diamond, HLE, and others). 1
No lineup changes at the top. Claude Opus 4.8 (Max) and GPT-5.5 (xhigh) are separated by a single point — a margin thin enough that one benchmark reweight could flip it. 2
Game of the night — Gemini vs Grok: best value in the top bracket
Google's Gemini 3.5 Flash (high) dropped to AI Index 55 against Grok 4.3 (high) at 53 — a two-point gap — but Google's pricing advantage keeps widening. At $1.31/M blended, Flash is 48% cheaper than Grok's $0.64/M... wait, no: Grok at $0.64/M is actually cheaper than Flash at $1.31/M. The real story is intelligence per dollar.
Gemini 3.5 Flash: 55 points of intelligence at $1.31/M. Grok 4.3 (high): 53 points at $0.64/M. Grok's cost efficiency wins on pure ratio — but Google's Flash still offers the highest raw intelligence at speed in its tier: 175.6 t/s, against Grok's 157.9 t/s. For applications where you need both throughput and brains, Flash's position is harder to attack than the headline score suggests. 3 4
Speed panel — who's running the floor
| Tier | Model | Speed (t/s) | Notes |
|---|---|---|---|
| League-wide | Mercury 2 | 742 | Inception Labs — not in core 6 |
| Fast tier | Gemini 3.5 Flash (high) | 175.6 | Speed leader in top 10 intelligence |
| Fast tier | Grok 4.3 (high) | 157.9 | Surprise: posted 158 t/s at AI Index 53 |
| Mid tier | Gemini 3.1 Pro Preview | 112.9 | Strong dual: 57 index + speed |
| Mid tier | Llama 4 Scout | 105.9 | 10M context window |
| Slow tier | Claude Opus 4.8 (Max) | 57.8 | Speed below average for its price class |
| Slow tier | GPT-5.5 (xhigh) | 54.7 | Also below average; TTFT 67s |
| Slow tier | DeepSeek V4 Pro (Max) | 46.2 | Cheapest; slowest of the group |
Mercury 2 retains the league-wide speed record at 742 t/s, down from the 824.7 t/s reported earlier this week — likely provider fluctuation rather than a model change. 1
The xAI call-up: Grok 4.3 (high) at 157.9 t/s was a quiet standout. Elon's franchise has the loudest ownership box and middling regular-season results, but 158 t/s at an AI Index of 53 — delivered at $0.64/M blended — is a legitimate three-and-D role that shouldn't be dismissed.

Pricing war breakdown
| Franchise | Input ($/1M) | Output ($/1M) | Blend (7:2:1) | Promo? |
|---|---|---|---|---|
| Anthropic Claude Opus 4.8 | $6.25 | $25.00 | $4.10 | — |
| OpenAI GPT-5.5 (xhigh) | $5.00 | $30.00 | $4.35 | — |
| Google Gemini 3.1 Pro | $2.00 | $12.00 | $1.74 | — |
| Google Gemini 3.5 Flash | $1.50 | $9.00 | $1.31 | — |
| xAI Grok 4.3 (high) | $1.25 | $2.50 | $0.64 | — |
| Kimi K2.6 | $0.95 | $4.00 | $0.70 | — |
| DeepSeek V4 Pro (Max) | $0.435 | $0.87 | $0.18 | ⚠️ Ends May 31 |
| Meta Llama 4 Scout | ~$0.17 | ~$0.66 | $0.22 | — |
DeepSeek 75% promo: 24 hours remaining. The discount on V4 Pro pricing runs through May 31, 2026. Current blended rate: $0.18/M. Post-promo permanent rate: ~$0.25/M (based on published input/output pricing of $0.435/$0.87, which already reflects the permanent post-promo structure per DeepSeek's announcements). Anyone evaluating DeepSeek's API cost should be budgeting on that $0.25/M floor — the $0.18 is an expiring bonus, not the baseline. 5
OpenAI's output premium. GPT-5.5 (xhigh) charges $30.00/M output — highest in the dataset. Claude Opus 4.8 is close at $25.00/M. Neither franchise has signaled a price move this cycle. Both appear comfortable playing the premium tier while Gemini 3.5 Flash absorbs the mid-range.
Context window chart
| Model | Context Window | Use case edge |
|---|---|---|
| Llama 4 Scout | 10M tokens | League-widest; RAG pipelines, bulk doc ingestion |
| Claude Opus 4.8 | 1M tokens | Multimodal; complex instruction stacking |
| GPT-5.5 (xhigh) | ~922K tokens | Near-parity with Claude |
| Gemini 3.1/3.5 Flash | 1M tokens | Strong multimodal + video input |
| Grok 4.3 | 1M tokens | Competitive parity |
| DeepSeek V4 Pro | 1M tokens | Text-only; no image input |

Meta's Llama 4 Scout holds the 10 million context window record — ten times the next tier. AI Index score of 14 keeps it outside the top bracket, but no other model comes close for tasks that require ingesting entire codebases or lengthy document archives. 6

Challenger watch
Kimi K2.6 (Moonshot AI) — AI Index: 54, Speed: 43.7 t/s, Blend: $0.70/M. Still sitting one point above DeepSeek V4 Pro and clear of every xAI variant in raw intelligence. The slow output speed (43.7 t/s vs 56 t/s median for its size class) is a recurring knock, but the intelligence-per-dollar ratio at $0.70/M blended competes directly with Grok. Worth tracking as a serious mid-field challenger. 7
New signings — xAI rookie watch:
- Grok Code Fast 1 — evaluated May 25, 2026. Coding specialist, early access. Designed to complement Grok 4.3 in developer workflows. Pricing and benchmark scores not yet in Artificial Analysis index. 8
Postgame summary
Claude Opus 4.8 holds the #1 slot at AI Index 61 — same as yesterday, no movement at the top. The real action is at the mid-table: Grok's speed and price efficiency made it the most underrated stat line on the board. DeepSeek's 75% promo expires in under 24 hours; the permanent rate lands around $0.25/M, which still clears everything from Google up. Gemini 3.5 Flash is doing a kind of hybrid guard role — mid-range scoring (55 AI Index) with elite speed (175.6 t/s) — and no one in the price tier is both faster and smarter. Watch the Kimi K2.6 box score next week.
Competitive position as of May 30, 2026: Top 2 locked in. Mid-bracket (Google/xAI/Kimi) actively contested by speed and price. DeepSeek promo deadline tomorrow. Llama plays a specialist role no one else can fill.
Data sourced from Artificial Analysis Intelligence Index v4.0, updated rolling 72-hour performance measurements. 1
#AILeague
이 콘텐츠를 둘러싼 관점이나 맥락을 계속 보강해 보세요.