AI League — Game Day 8: Grok Goes Turbo, Claude and GPT Cool Down

Grok 4.3 rockets to 174.8 t/s, closing to within 6 t/s of Flash's speed crown. Claude and GPT-5.5 pull back from yesterday's highs. Intelligence board stays locked at 61. Full June 5 stats. #AILeague

The lead stat

Yesterday, the two title contenders were putting up numbers. Today, the speed board reshuffled. Grok 4.3 jumped to 174.8 t/s, up 27.3 t/s from yesterday's 147.5, a single-session gain of +18.5%. That is not enough to overtake Gemini 3.5 Flash, which closed at 180.8 t/s to Grok's 174.8, but it does make Grok the fastest model on the board outside the Flash tier. Flash is still quicker on the clock by 6.0 t/s, yet Grok is now in the same speed category for the first time 1.

At the same time, Claude and GPT-5.5, coming off a combined surge that put GPT at a season-high 68.2 t/s just 24 hours ago, both gave back ground. Claude dropped 3.9 t/s to 59.8 2. GPT-5.5 pulled back 3.5 t/s to 64.7 3. Both remain comfortably ahead of their season averages, and the intelligence board didn't move. This reads less like regression and more like the top two franchises managing their rotation.
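The percentage moves behind those deltas can be checked with a few lines of Python. This is a minimal sketch that assumes the previous session's reading is simply today's speed minus the quoted delta; the function and variable names are illustrative, not part of any published tooling.

```python
# Speeds in tokens/sec and deltas vs. the previous session, as quoted above.
readings = {
    "Grok 4.3": (174.8, +27.3),
    "GPT-5.5": (64.7, -3.5),
    "Claude Opus 4.8": (59.8, -3.9),
}


def pct_change(today: float, delta: float) -> float:
    """Percent change relative to the previous session's reading."""
    previous = today - delta  # assumption: previous = today - delta
    return 100.0 * delta / previous


# Round to one decimal place, matching the precision used in the post.
changes = {
    name: round(pct_change(today, delta), 1)
    for name, (today, delta) in readings.items()
}

for name, pct in changes.items():
    print(f"{name}: {pct:+.1f}%")
```

On these inputs Grok's gain works out to the +18.5% quoted in the lead, while the Claude and GPT-5.5 pullbacks are single-digit percentages of their prior readings.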

Intelligence board — June 5

Artificial Analysis Intelligence Index: rankings, speed, and pricing across 395+ models 4

Rank	Team	Model	AI Index	Δ Index
🥇 1	Anthropic	Claude Opus 4.8 (Max)	61	↔
🥈 2	OpenAI	GPT-5.5 (xhigh)	60	↔
3	OpenAI	GPT-5.5 (high)	59	↔
4	Anthropic	Claude Opus 4.7 (Max)	57	↔
5	Google	Gemini 3.1 Pro Preview	57	↔
—	Challenger	Kimi K2.6	54	↔
6	xAI	Grok 4.3 (high)	53	↔
7	DeepSeek	DeepSeek V4 Pro (Max)	52	↔

Source: 4

The board has been static for eight straight sessions. Anthropic holds the title with a 1-point buffer over OpenAI. Claude leads Google's best entry (Gemini 3.1 Pro Preview, tied for 4th at 57) by four points, and GPT-5.5 (xhigh) leads it by three.

Speed panel — June 5

Claude Opus 4.8 detail — speed, pricing, and benchmark breakdown 2

Model	Speed (t/s)	Δ vs June 4	Tier
Gemini 3.5 Flash (high)	180.8	↓ –5.8	Flash
Grok 4.3 (high)	174.8	↑ +27.3 🔥	Premium
Gemini 3.1 Pro Preview	138.1	↔	Pro
GPT-5.5 (xhigh)	64.7	↓ –3.5	Elite
Claude Opus 4.8 (Max)	59.8	↓ –3.9	Elite
DeepSeek V4 Pro (Max)	52.3	↓ –1.4	Mid
Kimi K2.6	44.1	—	Challenger

Grok's 174.8 t/s is the highest reading for any xAI model this season. For context: Gemini 3.5 Flash — still the league's fastest — is now only 6 tokens per second ahead of a reasoning model that costs $0.64/M blended 1 5. That gap was 39 t/s at the start of the week.
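The gap arithmetic can be reproduced from the speed panel alone. The sketch below assumes each model's June 4 reading equals its June 5 speed minus the listed delta; it only covers the two days in the table, so it says nothing about readings earlier in the week.

```python
# (June 5 speed, delta vs. June 4) in tokens/sec, from the speed panel.
flash = (180.8, -5.8)
grok = (174.8, +27.3)


def previous(today: float, delta: float) -> float:
    """Reconstruct the prior session's reading from today's speed and delta."""
    return today - delta


# Gap between Flash and Grok on each day, rounded to one decimal place.
gap_june5 = round(flash[0] - grok[0], 1)
gap_june4 = round(previous(*flash) - previous(*grok), 1)

print(f"June 4 gap: {gap_june4} t/s, June 5 gap: {gap_june5} t/s")
```

Under that assumption, the table implies the Flash lead shrank from about 39 t/s on June 4 to 6 t/s on June 5.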

Pricing war breakdown

Grok 4.3 detail — today's speed surge to 174.8 t/s and $0.64/M pricing 1

No price changes today. Snapshot as of June 5:

Team	Model	Input	Output	Blended (7:2:1)
Anthropic	Claude Opus 4.8 (Max)	$6.25	$25.00	$4.10
OpenAI	GPT-5.5 (xhigh)	$5.00	$30.00	$4.35
Google	Gemini 3.1 Pro Preview	$2.00	$12.00	$1.74
Google	Gemini 3.5 Flash (high)	$1.50	$9.00	$1.31
xAI	Grok 4.3 (high)	$1.25	$2.50	$0.64
DeepSeek	DeepSeek V4 Pro (Max)	$0.44	$0.87	$0.18

DeepSeek's $0.18 blended floor remains one of the league's most stubborn price anchors, now 22 days post-promo with no sign of revision 6. On blended price, the top-2 club (Claude + GPT-5.5) costs roughly 2.4x to 2.5x Gemini Pro and 6.4x to 6.8x Grok for raw API calls. Grok's output price of $2.50/M stays the most aggressive among the models running above 100 t/s.
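The blended column can be sanity-checked against the 7:2:1 cache-hit/input/output weighting given in the data note. The table does not list cache-hit prices, so this sketch back-solves the cache-hit price implied by each published blend; those implied values are rounding-sensitive estimates under that assumption, not published prices, and all function names here are hypothetical.

```python
# USD per million tokens: (input, output, published blended), from the table.
prices = {
    "Claude Opus 4.8 (Max)": (6.25, 25.00, 4.10),
    "GPT-5.5 (xhigh)": (5.00, 30.00, 4.35),
    "Gemini 3.1 Pro Preview": (2.00, 12.00, 1.74),
    "Grok 4.3 (high)": (1.25, 2.50, 0.64),
}

WEIGHTS = (7, 2, 1)  # cache-hit : input : output, per the data note


def blended(cache_hit: float, inp: float, out: float, weights=WEIGHTS) -> float:
    """Weighted average price across cache-hit, input, and output tokens."""
    w_c, w_i, w_o = weights
    return (w_c * cache_hit + w_i * inp + w_o * out) / sum(weights)


def implied_cache_hit(inp: float, out: float, blend: float, weights=WEIGHTS) -> float:
    """Back-solve the cache-hit price that reproduces a published blend."""
    w_c, w_i, w_o = weights
    return (sum(weights) * blend - w_i * inp - w_o * out) / w_c


def blended_ratio(a: str, b: str) -> float:
    """How many times more expensive model a is than model b, on blended price."""
    return prices[a][2] / prices[b][2]


for name, (inp, out, blend) in prices.items():
    cache = implied_cache_hit(inp, out, blend)
    print(f"{name}: implied cache-hit price ~ ${cache:.2f}/M")

print(round(blended_ratio("Claude Opus 4.8 (Max)", "Gemini 3.1 Pro Preview"), 1))
print(round(blended_ratio("GPT-5.5 (xhigh)", "Grok 4.3 (high)"), 1))
```

The ratio helper compares blended prices only; ratios on input or output price alone will differ.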

Challenger watch

Kimi K2.6 holds at AI Index 54, a position that still sits one notch above Grok (53) and two above DeepSeek (52) on intelligence alone. Speed is the constraint: 44.1 t/s is the slowest reading on the speed panel. Blended price is $0.70/M, neither cheap enough to threaten DeepSeek on value nor fast enough to threaten Grok on throughput 7.

MiMo-V2.5-Pro, also at 54 on the index per the overall leaderboard, enters tied for the top open-weights spot 4.

Analyst call

Grok's speed spike is too big to wave off as noise. After three consecutive days of sub-150 t/s readings, 174.8 stands out, and the delta (+27.3) is larger than any other single-session move from any model this week, including GPT-5.5's +16.1 surge on June 1. The question now: can xAI hold that output rate, or will it drift back toward the 147–150 band it's occupied most of the season?

Claude's intelligence lead at 61 is eight sessions old. The number that matters next is whether Anthropic can extend it, or whether OpenAI finally puts GPT-5.5 on the same tier. For now the gap is 1 point and pricing is roughly equivalent, and anything under four points counts as a dead heat in most practical deployment decisions.

Data sourced from Artificial Analysis live model benchmarks. Speed figures represent output tokens/sec on first-party APIs at time of collection. Blended pricing at 7:2:1 cache-hit/input/output ratio.

#AILeague