AI Daily Briefing — Catch-Up Edition: Late April through Early May 2026

A catch-up edition covering two weeks of AI news (April 22 – May 6, 2026). Top stories: GPT-5.5 launches with agentic architecture and 85% ARC-AGI-2 score; Kimi K2.6 claims open-weight leadership; Microsoft Copilot goes GA for 400M users; Claude gains autonomous memory 'dreaming'; Apple opens iOS 27 as an AI model marketplace; Sora shut down after $1M/day losses; the Musk vs. OpenAI trial reveals internal emails; and Anthropic secures a landmark $300B infrastructure commitment while being excluded from the Pentagon's AI consortium.

Catch-up edition. This issue covers two weeks of AI news — April 22 through May 6, 2026 — a period dense enough that a single summary barely does it justice. Below are the 14 highest-signal stories, selected from roughly 80 collected items.
Cover: neural network and circuit board composite illustration
There has never been a two-week window quite like this one. A federal courtroom in Oakland started tearing open OpenAI's origin story. The Pentagon drew a line around autonomous weapons and mass surveillance, and Anthropic was shut out for refusing to permit either use. And both major frontier labs announced multibillion-dollar enterprise joint ventures on the same day. If the previous era of AI was about building models, this era seems to be about building empires.
Kimi K2.6 takes the open-weight crown
Moonshot AI (Beijing) released Kimi K2.6 on April 20: a 1-trillion-parameter MoE model with 32B active parameters, published under a modified MIT license [1]. The model handles text, image, and video input natively, and stands out for long-horizon coding tasks, UI/UX generation, and multi-agent orchestration. At $0.60/$2.50 per million tokens, it ranks sixth overall on LLM-Stats and first among open-weight models [2].
Why it matters: China's open-weight ecosystem is not closing the gap with frontier proprietary models — it has, by several measures, closed it. Kimi K2.6's autonomous multi-day task execution capability means open-weight agents are no longer a research curiosity.

DeepSeek V4 Pro and xAI Grok 4.3 apply pressure from two directions
DeepSeek released V4 Pro (1.6T total / 49B active, MoE) and V4 Flash (284B total / 13B active) under the MIT license on April 24 [3], achieving open-source SOTA in agentic coding with a 1M context window. Days later, xAI launched Grok 4.3 (May 1–6) with 1M context, always-on reasoning, and pricing at $1.25/$2.50 per million tokens, roughly 40–60% cheaper than Grok 4.2 [4]. Grok 4.3 ranks first on the CaseLaw v2 legal benchmark at 79.3%, and the launch package includes Custom Voices voice cloning and a Voice Agent API.
Mistral Medium 3.5 (April 28) rounds out the open-weight releases: 128B dense, 256K context, SWE-Bench Verified 77.6%, self-hostable on four GPUs [5].
Why it matters: Three open-weight releases in ten days, each competitive with closed models at launch. The pricing war is no longer theoretical — xAI is explicitly cutting rates to win share, and DeepSeek is doing the same simply by publishing weights.
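To make the price gap concrete, the quoted rates can be turned into a workload cost. This is a rough sketch, not a benchmark: it assumes the slash-separated figures are input/output prices per million tokens (the usual convention, though the briefing does not spell it out), and the workload size and all names in the code are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """USD per one million tokens. The input/output split is an assumption."""
    input_per_million: float
    output_per_million: float

    def cost(self, input_tokens: int, output_tokens: int) -> float:
        """Return the USD cost of a workload with the given token counts."""
        return (
            input_tokens * self.input_per_million
            + output_tokens * self.output_per_million
        ) / 1_000_000


# Prices as quoted in this briefing.
PRICES = {
    "Kimi K2.6": Pricing(0.60, 2.50),
    "Grok 4.3": Pricing(1.25, 2.50),
}

# Hypothetical workload; the token counts are illustrative only.
INPUT_TOKENS = 2_000_000
OUTPUT_TOKENS = 400_000


def workload_costs(input_tokens: int, output_tokens: int) -> dict[str, float]:
    """Cost of the same workload on each model, in USD."""
    return {
        name: pricing.cost(input_tokens, output_tokens)
        for name, pricing in PRICES.items()
    }


if __name__ == "__main__":
    for name, usd in workload_costs(INPUT_TOKENS, OUTPUT_TOKENS).items():
        print(f"{name}: ${usd:.2f}")
```

Under that assumption the two models charge the same for output, so the whole difference comes from input tokens, and input-heavy workloads see the largest gap.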

Product updates

Microsoft 365 Copilot agentic capabilities go GA for 400M+ users
On April 22, Microsoft made Copilot's agentic capabilities in Word, Excel, and PowerPoint the default experience across all Microsoft 365 subscriptions [6]. Copilot can now autonomously draft and refactor documents, analyze and visualize data, and build presentations: it executes rather than merely suggests. Early engagement data from Microsoft: Word up 52%, Excel up 67%, PowerPoint up 11%.
Why it matters: This is the largest agentic AI deployment in history by user count. The engagement gains are meaningful, but they were collected during early rollout, when novelty effects are highest. Whether they hold at month three is the real test, and Microsoft has more financial incentive than any company on earth to make sure they do.

Claude gets "dreaming" — persistent memory extracted from past conversations
Anthropic's May 6 update to Claude introduced "dreaming" for Managed Agents: agents asynchronously review historical conversations to extract and store reusable memory [7][8]. Rate limits were also raised significantly across Claude tiers.
Why it matters: Memory is the missing ingredient that separates an AI tool from an AI collaborator. "Dreaming" is a first-of-its-kind production feature for autonomous memory consolidation — not user-managed, not prompt-injected, but machine-initiated. Every competing AI assistant that lacks this will feel slightly more amnesiac by comparison.

Apple concedes the AI model race, turns iOS 27 into a marketplace
Apple announced on May 5 that iOS 27, iPadOS 27, and macOS 27 will allow users to choose third-party AI models, including Gemini and Claude, as the default engine for Apple Intelligence [9]. A new Extensions API will let third-party models power Siri, Writing Tools, and Image Playground.
Why it matters: This is a major strategic concession. Apple is not betting it can win the model race; it is betting it can own the distribution layer regardless of who wins. That repositioning reshapes the consumer AI platform landscape more than any new Apple model would have.

Sora is dead — $1M/day losses and a market that never materialized
OpenAI officially shut down Sora on April 26 [10]. The reported causes: operating losses of $1M per day, low user engagement, unresolved copyright exposure, and a competitive video generation market where Runway and Pika have moved faster. For a company projecting $100B+ in revenue, the unit economics never worked.
Why it matters: Sora's shutdown is a rare visible retreat for OpenAI. AI video generation turns out to be hard to monetize even when the technical capability is real. That's useful calibration for the rest of the industry.

Industry news

Musk vs. OpenAI: the trial everyone is watching, and what it actually revealed
The federal trial opened April 27 in Oakland [11]. Elon Musk is seeking $134–150B in damages and an injunction against OpenAI's nonprofit-to-profit conversion. Two revelations have dominated coverage.
First, Greg Brockman's personal diary, entered into evidence, describes a plan to "steal the nonprofit from Musk." Second, and more consequential for regulatory purposes: Brockman, Altman, Sutskever, and Adam D'Angelo all held undisclosed personal equity in Cerebras while OpenAI committed over $10B in chip purchases and a $1B loan with warrants to the same company [12]. Musk also admitted xAI distilled GPT to train Grok, violating OpenAI's terms of service. Prediction market Kalshi puts Musk's odds at 34–40%, down from 60% before trial [13].
Why it matters: The Cerebras self-dealing disclosure is the more durable story here. Whether or not Musk wins, the conflict-of-interest allegations have potential SEC implications and will complicate Cerebras' IPO narrative considerably.

Cerebras IPO: $26.6B valuation, complicated by the trial
Cerebras filed an S-1 on April 17 targeting a $3.5B raise at $115–125/share, implying a $26.6B valuation [14]. Pricing is expected mid-May. That timeline is now running straight into the Musk trial disclosures.
Why it matters: What was a chip company IPO is now a governance story. Institutional investors will need to weigh the self-dealing allegations against the commercial fundamentals, and that is a harder sell than a clean listing would have been.

The Pentagon picks sides: seven AI deals signed, Anthropic frozen out
On May 1, the DoD signed classified military AI agreements with SpaceX, OpenAI, Google, NVIDIA, Reflection, Microsoft, and AWS [15]. Anthropic was explicitly excluded after refusing to permit Claude's use in fully autonomous weapons systems and mass surveillance programs. Defense Secretary Hegseth publicly listed Anthropic as a supply chain risk. Anthropic filed two lawsuits challenging the classification [16].
Google's classified agreement, signed April 28, permits DoD use of its AI for "any lawful government purpose." That phrasing prompted protests from more than 700 Google employees [17].
Why it matters: This crystallizes the industry's defense alignment fault line. Anthropic is betting its ethical constraints are a long-term differentiator. Whether that bet pays off depends on whether enterprise and consumer buyers value it enough to offset exclusion from a very large government market.

Anthropic's $300B+ infrastructure buildout — and a $50B funding round in the wings
In rapid succession: Amazon put in another $5B (total commitment now $13B, with potential up to $25B) tied to a $100B+ AWS spend pledge over 10 years [18][19]. CoreWeave announced a multi-year deal covering 300MW and 220,000+ NVIDIA GPUs [20]. The Information reported a roughly $200B Google Cloud commitment over five years (unconfirmed officially). The May 6 SpaceX deal adds Colossus 1's 300MW+ of compute to the stack, deployable within a month [7].
Bloomberg reported Anthropic is evaluating a $50B funding round at an $850–900B valuation.
Why it matters: Anthropic's combined cloud commitments exceed $300B — larger than any infrastructure buildout in AI history. The company is buying compute independence while remaining compute-dependent. That tension will define its next two years.

Both frontier labs launch enterprise joint ventures — on the same day
May 4 produced an odd symmetry. Anthropic formed an enterprise AI services JV with Blackstone, Hellman & Friedman, and Goldman Sachs, valued at $1.5B with $300M from each partner [21]. Hours later, OpenAI announced "The Development Company," an enterprise JV valued at $100B, with $4B raised from 19 investors, none of them overlapping with Anthropic's backers [22].
Why it matters: The same playbook, executed simultaneously: carve out the deployment business from core R&D, unlock a separate valuation, and attract capital from non-tech institutional investors. It is hard to know whether to read this as healthy competition or as two companies running identical financialization strategies at the same moment.

Research

OpenAI o1 beats physicians in a peer-reviewed Science paper
A Harvard/Stanford/BIDMC team published results in Science on April 30 showing that o1-preview outperformed attending physicians across all clinical reasoning experiments [23]. In emergency room triage, o1 produced accurate or near-accurate diagnoses in 67.1% of cases, compared with 55.3% and 50.0% for two attending physicians. The paper notes this is the first AI system to satisfy the 1959 Science standard for exceeding human clinical diagnostic ability.
Why it matters: This is the most consequential AI-in-healthcare result to date: not a company benchmark, not a preprint, but a peer-reviewed Science paper with direct clinical implications for emergency medicine. The margin of roughly 12 to 17 percentage points over the two attending physicians is not subtle.
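The size of the gap can be read directly off the reported figures. A minimal sketch of the arithmetic, using only the three triage percentages quoted above; the function and variable names are illustrative:

```python
# Share of ER triage cases with an accurate or near-accurate diagnosis,
# as reported in this briefing (percent).
O1_SCORE = 67.1
PHYSICIAN_SCORES = [55.3, 50.0]


def margins_in_points(model: float, humans: list[float]) -> list[float]:
    """Percentage-point gap between the model and each human score."""
    return [round(model - h, 1) for h in humans]


if __name__ == "__main__":
    print(margins_in_points(O1_SCORE, PHYSICIAN_SCORES))
```

The gaps come out to 11.8 and 17.1 percentage points, so the size of the margin depends on which attending physician serves as the comparison.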

ICLR 2026: a capability jump, measured
ICLR 2026 (April 23–27) awarded Outstanding Papers to "Transformers are Inherently Succinct," a theoretical proof of transformer encoding power, and "LLMs Get Lost In Multi-Turn Conversation," which documents real-world LLM degradation in extended sessions [24]. Separately, a paper showed Claude Opus 4.7 can now autonomously implement an AlphaZero self-play pipeline for Connect Four, a task that was impossible for any AI system in January 2026 [25].
Why it matters: The AlphaZero result is a useful forcing function for AI capability forecasting. A task going from impossible to near-saturated in four months does not fit neatly into linear extrapolation. It also does not fit neatly into the "AI progress is slowing" narrative that has been circulating since late 2025. The multi-turn degradation paper is the needed counterweight: capability benchmarks and reliability benchmarks are not the same thing, and the industry has been measuring the former while quietly ignoring the latter.

Also worth noting

Date | Item | Signal
Apr 27 | David Silver / Ineffable Intelligence raised $1.1B for RL-based "superlearner" without human data (Sequoia, Lightspeed, Google, NVIDIA) | RL-from-scratch gaining serious institutional backing
Apr 28 | OpenAI on AWS Bedrock: GPT-5.5, Codex, Managed Agents in limited preview [26] | OpenAI reaching buyers who are locked into AWS
Apr 27 | OpenAI FedRAMP Moderate: ChatGPT Enterprise and API authorized for US federal use [27] | Government procurement pathway now clear
Apr 30 | Google Cloud Next '26: Gemini Enterprise Agent Platform, 8th-gen TPU, Gemma 4 open model, Deep Research Max | Google's enterprise stack looks complete
Apr 29 | Cursor SDK public beta: persistent agents with lifecycle control for developers | Agentic developer tools maturing toward production
May 6 | DeepSeek closes first VC round at $45B valuation | First external capital for the Chinese lab; changes its governance posture
May 5 | ChatGPT Ads expansion: Dentsu, Omnicom, WPP agency channels; self-service Ads Manager; CPC bidding [28] | OpenAI's revenue diversification beyond API/subscription

Bottom line

The trial in Oakland is the most revealing thing to happen in AI governance this year, not because Musk is likely to win, but because the discovery process is doing the work that press and regulators have not. Undisclosed executive equity stakes in a vendor receiving multibillion-dollar purchase commitments are not a minor disclosure; they are the kind of conflict that reshapes fiduciary expectations across the industry.
Meanwhile, the scale numbers are getting genuinely difficult to reason about. Anthropic's combined infrastructure commitments exceed $300B. OpenAI's enterprise JV is valued at $100B on day one. These are not R&D numbers; they are industrial buildout numbers. The labs are no longer building products — they are building infrastructure at a pace that will make the compute constraints of 2023 look quaint.
The Apple announcement deserves more attention than it received in the news cycle. A company that controlled its entire AI stack just opened that stack to competitors. That is not a feature — it is a strategic repositioning. When the company with the world's most valuable consumer platform decides it cannot win on model quality alone, that says something about how hard this problem actually is.
Watch over the next month: Cerebras IPO pricing (will the self-dealing allegations stick with institutional buyers?), Anthropic's lawsuits against the Pentagon blacklist (first real legal test of AI ethics policies in defense procurement), and whether GPT-5.5's agentic benchmarks translate into measurable enterprise outcomes.

Cover image generated by AI.
