AI League — Game Day 11: Grok Clears 207 t/s, Gemini Flash Gives Chase in Speed Dead Heat

Grok 4.3 hits 207.6 t/s — 5th straight record day, +43% since season open. Gemini 3.5 Flash answers at 206.8 t/s, turning the speed crown into a 0.8 t/s photo finish. GPT-5.5 bounces back 3.3 t/s. Intelligence board locked at 61 for 11 days. #AILeague

AIL·Stats Board
June 8, 2026 · 08:37
Grok 4.3 clears 207 t/s — new season-high — as Gemini 3.5 Flash matches pace within 0.8 t/s, turning the speed crown into a two-horse race. GPT-5.5 bounces back 3.3 t/s after back-to-back declines. Intelligence board stays frozen for an 11th consecutive day. Full June 8 stats. #AILeague

Intelligence board

The scoreboard hasn't shifted since Season Opening Night. Claude Opus 4.8 sits at 61 pts for an 11th straight day, GPT-5.5 at 60, with no new model reaching the 60-tier. The frozen board is less a sign of stability than a sign of how hard it is to break through: reaching 61 means outperforming 395 other benchmarked models across 10 independent evaluations.
Rank | Team | Model | AI Index | Δ Day
1 | Anthropic | Claude Opus 4.8 (Max) | 61 | -
2 | OpenAI | GPT-5.5 (xhigh) | 60 | -
3 | Google | Gemini 3.1 Pro Preview | 57 | -
4 | Kimi | Kimi K2.6 | 54 | -
5 | xAI | Grok 4.3 (high) | 53 | -
6 | DeepSeek | DeepSeek V4 Pro (Max) | 52 | -
1
Kimi K2.6 remains the highest-ranked open-weights model, above every Grok variant and above DeepSeek on the intelligence board, even though it sits in the lower half of this core-six snapshot.

Speed panel

This is where today gets interesting.
[Chart: Grok 4.3's 11-day speed run, from Season Opening Night (145.2 t/s) to today (207.6 t/s)]
Grok 4.3 hit 207.6 t/s — its fifth consecutive record high and a 5% single-day gain over Day 10's 197.7 t/s. Since Season Opening Night, xAI has added 62.4 t/s to its output speed, a +43% run in 11 days. 2
The story isn't just Grok's run, though. Gemini 3.5 Flash checked in at 206.8 t/s, within 0.8 t/s of Grok and essentially tied. Google is running a two-pronged speed strategy: Flash at about 207 t/s for throughput-sensitive workloads, and Pro Preview at 142.6 t/s for the accuracy-conscious tier. Between the two, Google fields the deepest speed bench in the league. 3 4
GPT-5.5 bounced from 61.7 to 65.0 t/s — recouping about half of its two-day slide. Not a breakout, but at least the bleeding stopped. 5
Claude Opus 4.8 held at 66.8 t/s, barely ahead of GPT-5.5 and roughly on pace with where it opened the season. 6
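For readers who want to check the speed arithmetic, here is a minimal Python sketch that recomputes the gains from the readings quoted above. The constant and function names are illustrative only; the figures are the ones in this post, not a fresh pull from any API.

```python
# Speed readings (tokens per second) quoted in the speed panel.
GROK_OPENING = 145.2   # Season Opening Night
GROK_DAY_10 = 197.7
GROK_DAY_11 = 207.6
FLASH_DAY_11 = 206.8   # Gemini 3.5 Flash, same day


def pct_change(old: float, new: float) -> float:
    """Percentage change going from old to new."""
    return (new - old) / old * 100.0


season_gain_tps = GROK_DAY_11 - GROK_OPENING
season_gain_pct = pct_change(GROK_OPENING, GROK_DAY_11)
day_gain_pct = pct_change(GROK_DAY_10, GROK_DAY_11)
lead_over_flash = GROK_DAY_11 - FLASH_DAY_11

print(f"Season gain: {season_gain_tps:.1f} t/s ({season_gain_pct:+.0f}%)")
print(f"Day-over-day gain: {day_gain_pct:+.1f}%")
print(f"Lead over Gemini 3.5 Flash: {lead_over_flash:.1f} t/s")
```

The same `pct_change` helper works for any two readings on the board, so it can be reused for tomorrow's numbers.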

Pricing war breakdown

No price moves today. The gap between the top-intelligence tier and the value tier remains wide, and that's by design.
Team | Model | Input | Output | Blended
Anthropic | Claude Opus 4.8 (Max) | $6.25 | $25.00 | $4.10
OpenAI | GPT-5.5 (xhigh) | $5.00 | $30.00 | $4.35
Google | Gemini 3.1 Pro Preview | $2.00 | $12.00 | $1.74
Google | Gemini 3.5 Flash (high) | $1.50 | $9.00 | $1.31
Kimi | Kimi K2.6 | $0.95 | $4.00 | $0.70
xAI | Grok 4.3 (high) | $1.25 | $2.50 | $0.64
DeepSeek | DeepSeek V4 Pro (Max) | $0.435 | $0.87 | $0.18
1
Grok 4.3 at $0.64/1M blended is the most cost-efficient proprietary reasoning model among those scoring 53 or higher on the intelligence index. Combining speed and price, xAI is posting arguably the best value proposition for throughput-heavy workloads. DeepSeek V4 Pro at $0.18 blended remains the outright cheapest at AI Index 52, but the open-weights team's 59.7 t/s is the slowest in this cohort.
The real squeeze is on Anthropic and OpenAI: both charge a premium to hold the top-two intelligence slots, while Google is offering 57-point intelligence (Gemini Pro) at roughly 40% of their blended price, and 55-point intelligence (Flash) at roughly 30% of their price with about three times the speed.
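The price gaps can be checked straight from the pricing table. The sketch below, in Python, compares blended prices and applies the 7:2:1 cache-hit/input/output weighting stated in the source note at the end of this post. Cache-hit prices are not published here, so `implied_cache_hit` only back-solves what the blended figure would imply; treat that number as an inference, not a quoted price. All function names are illustrative.

```python
# Per-1M-token USD prices from the pricing table: (input, output, blended).
PRICES = {
    "Claude Opus 4.8 (Max)": (6.25, 25.00, 4.10),
    "GPT-5.5 (xhigh)": (5.00, 30.00, 4.35),
    "Gemini 3.1 Pro Preview": (2.00, 12.00, 1.74),
    "Gemini 3.5 Flash (high)": (1.50, 9.00, 1.31),
}

# 7:2:1 cache-hit/input/output blend, as stated in the source note.
W_CACHE, W_INPUT, W_OUTPUT = 0.7, 0.2, 0.1


def blended(cache_hit: float, inp: float, out: float) -> float:
    """Weighted blend of the three per-1M-token prices."""
    return W_CACHE * cache_hit + W_INPUT * inp + W_OUTPUT * out


def implied_cache_hit(inp: float, out: float, blend: float) -> float:
    """Back-solve the cache-hit price implied by a blended figure (an inference)."""
    return (blend - W_INPUT * inp - W_OUTPUT * out) / W_CACHE


def price_ratio(model: str, baseline: str) -> float:
    """Blended price of model as a fraction of baseline's blended price."""
    return PRICES[model][2] / PRICES[baseline][2]


for challenger in ("Gemini 3.1 Pro Preview", "Gemini 3.5 Flash (high)"):
    for incumbent in ("Claude Opus 4.8 (Max)", "GPT-5.5 (xhigh)"):
        ratio = price_ratio(challenger, incumbent)
        print(f"{challenger} vs {incumbent}: {ratio:.0%} of blended price")
```

Because the blended column is rounded to cents, any back-solved cache-hit price is only approximate, especially for the cheapest models.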

Challenger watch

MiMo-V2.5-Pro entered the Artificial Analysis (AA) index at 54 pts, tied with Kimi K2.6 for the highest open-weights score on the board. If MiMo sustains that rating across reruns, it becomes the first new open-weights model this season to match Kimi's standing. No speed or pricing data was available for MiMo in today's snapshot. 1

Stat of the day

Grok 4.3 is now generating tokens 3.1× faster than Claude Opus 4.8 (207.6 vs 66.8 t/s) while scoring only 8 intelligence points lower (53 vs 61). For latency-sensitive applications, that trade-off math is hard to ignore.
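The same trade-off can be laid out for every model whose speed and index both appear in this post. This is a rough Python sketch with illustrative names; Kimi K2.6 is left out because no speed figure is quoted for it today.

```python
# (AI Index, output speed in t/s) for models with both figures quoted in this post.
MODELS = {
    "Claude Opus 4.8 (Max)": (61, 66.8),
    "GPT-5.5 (xhigh)": (60, 65.0),
    "Gemini 3.1 Pro Preview": (57, 142.6),
    "Grok 4.3 (high)": (53, 207.6),
    "DeepSeek V4 Pro (Max)": (52, 59.7),
}


def compare(model: str, baseline: str = "Claude Opus 4.8 (Max)"):
    """Return (speed multiple vs baseline, index points given up vs baseline)."""
    index, tps = MODELS[model]
    base_index, base_tps = MODELS[baseline]
    return tps / base_tps, base_index - index


for name in MODELS:
    speed_multiple, points_behind = compare(name)
    print(f"{name}: {speed_multiple:.1f}x the speed, {points_behind} pts behind")
```

How many index points a given speed multiple is worth depends on the workload, so the output is a starting point for comparison, not a ranking.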

Stats sourced from Artificial Analysis live API measurements. Intelligence Index v4.0 incorporates 10 evaluations. Speed figures reflect first-party API performance. Prices in USD per 1M tokens (blended at 7:2:1 cache-hit/input/output ratio).
#AILeague
