AI League — Season Opening Night: The Official Stats Panel, Week 1

Presented by AI League Daily. Data sourced from Artificial Analysis, official API pricing pages, and OpenRouter usage rankings.

Claude Opus 4.8 takes the #1 slot on the Artificial Analysis Intelligence Index. GPT-5.5 drops to second. Gemini 3.5 Flash sets a season-high speed record at 207 t/s. And DeepSeek just slashed V4 Pro output pricing by 75% — the biggest in-season price cut any team has posted this year.

🏆 Overall standings — Intelligence Index (Artificial Analysis, live)

Rank	Team	Model	AI Index	Δ	Price/1M blend
🥇 1	Anthropic (Claude)	Claude Opus 4.8 (Reasoning Max)	61	↑ NEW	$4.10
🥈 2	OpenAI (GPT)	GPT-5.5 (xhigh)	60	—	$4.35
🥉 3	OpenAI (GPT)	GPT-5.5 (high)	59	—	$4.35
4	Anthropic (Claude)	Claude Opus 4.7 (Reasoning Max)	57	↓ −1	$4.10
5	Google (Gemini)	Gemini 3.1 Pro Preview	57	↑ NEW	$1.74
13	xAI (Grok)	Grok 4.3 (high)	53	—	$0.64
17	DeepSeek	DeepSeek V4 Pro (Reasoning Max)	52	—	$0.18
18	Meta (Llama)	Muse Spark	52	↑ NEW	—

Artificial Analysis Intelligence Index (0–100 composite score, live 72-hr rolling window)

AI model leaderboard intelligence index chart from Artificial Analysis — Live intelligence index rankings from Artificial Analysis — updated every 72 hours across 100+ models 1

🔥 Game of the night: Claude vs. GPT in the scoring race

Claude Opus 4.8 posts an Intelligence Index of 61 to GPT-5.5's 60 — a one-point margin that is statistically tight on Artificial Analysis's composite methodology, but Claude has now held the top position across successive measurements. GPT-5.5 remains the higher-volume franchise: it fields three active pricing tiers (xhigh, high, medium) priced at a flat $4.35/1M blend compared to Claude Opus 4.8's $4.10/1M, a 6% pricing edge for Anthropic at equal performance.

Meanwhile, the legacy GPT-4o line (not on the current Artificial Analysis front page) has been effectively benched in favor of the GPT-5 series. OpenAI's pricing page no longer lists standalone GPT-4o rates — the franchise is running on a GPT-5.4 / GPT-5.5 two-tier rotation.

2 3

⚡ Speed panel — tokens per second (Artificial Analysis live)

This is where the scoreboard gets interesting. Intelligence and speed almost never correlate.

Team	Model	Output speed (t/s)	TTFT (s)	Speed tier
Google (Gemini)	Gemini 3.5 Flash	207 t/s	16.2s	🚀 Season-high
Google (Gemini)	Gemini 3.1 Flash-Lite	268 t/s	5.7s	🚀 Fastest Gemini
Anthropic (Claude)	Claude Sonnet 4.6 (Reasoning Max)	120 t/s	110.7s	⚡ Fast-reasoning
Anthropic (Claude)	Claude Haiku 4.5	97 t/s	16.9s	⚡ Budget burner
xAI (Grok)	Grok 4.3 (high)	182 t/s	12.5s	🏃 Underrated pace
OpenAI (GPT)	GPT-5.5 (xhigh)	40.7 t/s	49.1s	🐢 Deliberate
DeepSeek	DeepSeek V4 Pro (Reasoning Max)	50 t/s	1.8s	⚡ Low TTFT

Season stat: Gemini 3.5 Flash's 207 t/s is the fastest production speed among models in the AI Index top 10 score bracket (55+). Google may be outside the intelligence top 5, but they're running the fastest offense of any team in that tier.

Grok's quiet hustle: Grok 4.3 (high) at 182 t/s with 12.5s TTFT punches above its Intelligence Index rank of 53. If xAI can close the scoring gap, the speed advantage becomes a real weapon.

Business data dashboard panel showing comparative performance metrics — Speed and throughput metrics: multi-dimension performance dashboard 1

💰 Pricing war analysis — the cost-per-point breakdown

The real competition is in the value tier — Intelligence Index points per dollar of API cost.

Team	Flagship model	Input $/1M	Output $/1M	AI Index	Points per $1 output
DeepSeek	V4 Pro (75% promo)	$0.435	$0.87	52	59.8
Google	Gemini 3.5 Flash	$0.30*	$2.50*	55	22.0
xAI	Grok 4.3	$1.25	$2.50	53	21.2
Google	Gemini 2.5 Pro	$1.25	$10.00	35	3.5
Anthropic	Claude Sonnet 4.6	$3.00	$15.00	52	3.5
OpenAI	GPT-5.5	$5.00	$30.00	60	2.0
Anthropic	Claude Opus 4.8	$5.00	$25.00	61	2.4

Gemini 2.5 Pro/Flash pricing from Google AI Developer pricing page; DeepSeek pricing per official API docs (75% discount valid through May 31, 2026 UTC); Grok from xAI API page.

2 3 4 5 6

DeepSeek's fire sale

DeepSeek's V4 Pro is currently on a 75% promotional discount — output at $0.87/1M versus the post-promotion rate of $3.48/1M. With an Intelligence Index of 52, it delivers roughly 60 AI-index points per dollar of output, compared to GPT-5.5's 2.0. This promo expires May 31, 2026. If DeepSeek holds that line post-discount, it becomes a real threat to the mid-market. If it reverts to full price, the equation changes entirely.

The macro pattern: Google and DeepSeek are playing a volume game. Gemini 3.5 Flash at $0.30/$2.50 in/out and Gemini 2.5 Flash at $0.30/$2.50 are the lowest-priced capable models from any named team. Google's free tier (via Google AI Studio) makes it the only franchise offering zero-cost access to a competitive model.

Speedometer dashboard — tokens per dollar value race among AI model API tiers — API pricing race: value-per-token is the new speed stat — DeepSeek's 75% promo ends May 31 4

📊 Context window depth chart

Team	Model	Context window	Field advantage
Meta (Llama)	Llama 4 Scout	10M tokens	Widest context in the league
xAI (Grok)	Grok 4.20 / 4.1 Fast	1M tokens	Second-widest
Google (Gemini)	Gemini 1.5 Pro (May)	1M tokens	Third-widest
Anthropic (Claude)	Claude Opus 4.8	1M tokens	Tied third
OpenAI (GPT)	GPT-5.5	922k tokens	Slight edge over prior 128k window
DeepSeek	DeepSeek V4 Pro/Flash	1M tokens	Full million

Meta's Llama 4 Scout at 10 million tokens is the long-haul freight runner of this league — no other model is within 10x of its context capacity. That's a specialized stat, but for enterprises running massive document pipelines it's a decisive edge.

🆕 Challenger watch — challengers making noise this cycle

Kimi K2.6 (Moonshot AI) — Intelligence Index 54, ranking it above DeepSeek V4 Pro and all named Grok variants. Priced at $0.70/1M blend. An unseeded challenger crashing the main draw.
Mistral Medium 3.5 — AI Index 39, 152 t/s, priced at $2.10/1M blend. Solid mid-tier performance, especially for EU-regulated deployments.
Cohere Command A+ — AI Index 37, 223 t/s output speed — the highest raw token throughput on the entire leaderboard among models with a price listed. Zero-cost pricing (free tier). Useful for latency-critical applications but the intelligence gap versus tier-1 models is wide.

🏁 Postgame summary

Claude Opus 4.8 holds the scoring record. GPT-5.5 stays in contention on depth and ecosystem. Google's Gemini family is running the fastest offense in the ability-tier. DeepSeek's discount is the price-war story of the cycle — $0.87/1M output won't last past May 31. Meta's Llama 4 Scout has a context advantage no other franchise can touch. Grok is faster than its rankings suggest.

The teams that should be watching? Anyone who isn't Google or DeepSeek in the budget tier is currently fighting for value at a significant per-token disadvantage.

Data as of May 29, 2026 — live metrics from Artificial Analysis (72-hr rolling), official pricing from each team's API documentation pages. Speed metrics reflect streaming output tokens per second under standard single-request conditions.

#AILeague