Issue 04: Billing Day — Copilot Credits Go Live, Opus 4.8 Arrives, and Microsoft Pulls Claude Code


Five developments reshaping AI engineering costs and tooling: GitHub Copilot's usage-based billing launches today with a new Max plan and flex allotments; Anthropic ships Claude Opus 4.8 with Dynamic Workflows (up to 1,000 subagents per session, 16 running concurrently) and a 4× honesty improvement; Microsoft cancels Claude Code licenses across its Experiences + Devices division by June 30; LangGraph SDK v0.4.0 hardens WebSocket streaming; and Pydantic AI adds Opus 4.8 support alongside a NAT64 SSRF security fix.

AI Product Engineer Day by Day
2026. 6. 1. · 08:04
The billing clock ran out today. As of June 1, every GitHub Copilot user on a monthly plan is now on usage-based billing — premium request units are gone, GitHub AI Credits are live, and the math that engineers have been running on annual plans is about to get messier. That transition sits at the center of a week that also saw Claude Opus 4.8 ship with a native parallel subagent runtime, Microsoft pull Claude Code licenses from thousands of its own engineers, and LangGraph + Pydantic AI each push releases that quietly improve the open-source agentic stack.
The throughline: the cost of agentic AI work is being repriced, rebundled, and in some cases, centrally controlled — all at once.

1. GitHub Copilot's billing model just went live

The June 1 switch was announced in April, and today it's real. Monthly Pro and Pro+ subscribers have been auto-migrated; annual subscribers stay on premium request units until their plan expires, then convert.1
The base plan prices didn't change — but GitHub updated the included usage math in May after hearing pushback about whether $10/month would stretch far enough for agentic sessions.2 The current individual lineup, as of today:
Plan    Price      Base credits   Flex allotment   Total included
Free    $0         Limited completions only
Pro     $10/mo     $10            $5               $15
Pro+    $39/mo     $39            $31              $70
Max     $100/mo    $100           $100             $200
The flex allotment is variable by design — GitHub says it will adjust as model pricing and efficiency evolve. Base credits are fixed 1:1 with subscription price. Code completions and next-edit suggestions remain unlimited on paid plans and don't consume credits.
For Business and Enterprise, the math is different. Existing customers get a promotional uplift through August: Business goes from $19 to $30 in monthly credits; Enterprise from $39 to $70.1 The stated rationale is to give teams time to instrument their actual consumption before the promotions drop back to base rates in September.
What this means operationally: annual plan subscribers are on a countdown. Their current pricing holds until expiration, then they move to monthly or downgrade to Free. Model multipliers are increasing for annual subscribers starting today.1 Teams with renewals coming in Q3 or Q4 should model their June/July consumption against the promotional credit amounts — that data will tell them whether the base rates are enough before the buffer disappears.
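To make the credit math concrete, here is a minimal Python sketch that encodes the individual plan table above and picks the cheapest plan whose included credits cover a given month of usage. The `Plan` class and `cheapest_covering_plan` helper are hypothetical names for illustration, and the sketch assumes usage is measured in the same dollar-denominated credits. It deliberately ignores anything that happens past the included amount, since the flex allotment can change.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Plan:
    name: str
    price: float         # monthly subscription price, USD
    base_credits: float  # fixed 1:1 with the subscription price
    flex_credits: float  # variable allotment; GitHub may adjust it

    @property
    def included(self) -> float:
        """Total credits included in the plan for one month."""
        return self.base_credits + self.flex_credits


# Figures copied from the plan table above (Free has no credit allotment).
PLANS = [
    Plan("Pro", 10, 10, 5),
    Plan("Pro+", 39, 39, 31),
    Plan("Max", 100, 100, 100),
]


def cheapest_covering_plan(monthly_usage: float) -> Optional[Plan]:
    """Return the lowest-priced plan whose included credits cover the usage."""
    for plan in sorted(PLANS, key=lambda p: p.price):
        if plan.included >= monthly_usage:
            return plan
    return None  # usage exceeds every plan's included credits


if __name__ == "__main__":
    for usage in (12, 50, 250):
        plan = cheapest_covering_plan(usage)
        print(usage, plan.name if plan else "exceeds all included allotments")
```

Feeding June and July consumption through a check like this is one way to see whether the base allotment alone would have been enough.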

2. Claude Opus 4.8 ships — 41 days after 4.7, same price, significantly different capability

Anthropic released Opus 4.8 on May 28, priced identically to 4.7: $5 per million input tokens and $25 per million output tokens. Fast mode (2.5× speed) is now three times cheaper than it was before.3
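For a sense of scale, the list prices above translate into per-request cost as follows. This is a back-of-the-envelope sketch: `request_cost` is a hypothetical helper, and it assumes plain list pricing with no prompt caching, fast mode, or effort-mode effects.

```python
# List prices quoted above, in USD per million tokens.
INPUT_USD_PER_MTOK = 5.00
OUTPUT_USD_PER_MTOK = 25.00


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request at list price (no caching)."""
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


if __name__ == "__main__":
    # An illustrative agentic step: 200k tokens of context in, 20k tokens out.
    print(f"${request_cost(200_000, 20_000):.2f}")
```

Output tokens cost five times as much as input tokens, so long generations move the bill faster than large contexts do.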
The pace matters as much as the model. 41 days between major Opus versions is faster than most production teams were planning for, and Anthropic immediately removed Opus 4.6 from the model selector — no soft sunset, just gone.4 Teams with production pipelines on 4.6 have no rollback path.
The capability changes worth tracking:
Honesty and reliability: Opus 4.8 is approximately 4× less likely than 4.7 to allow flaws in code to pass unremarked — it flags uncertainties rather than confidently proceeding.3 For long-running agentic tasks where a wrong turn at step 3 compounds through step 30, this matters more than benchmark numbers.
Effort Modes: Users can now dial reasoning depth from low to max per request. High effort is the default; "extra" and "max" cost more tokens but produce better results on difficult or async workflows. Low effort runs faster and preserves rate limits. This replaces the previous pricing-tier model as the cost-quality dial.3
Dynamic Workflows (research preview): Available in Claude Code for Enterprise, Team, and Max plan subscribers, this feature lets Claude plan a task and spawn hundreds of parallel subagents. Hard limits: 1,000 total subagents per session, 16 concurrent. Workflow state lives in JavaScript variables outside Claude's context window. Anthropic cites a codebase-scale migration across hundreds of thousands of lines of code as the reference use case.3
Messages API change for developers: System entries can now be injected mid-task inside the messages array without breaking the prompt cache. This lets agentic harnesses update permissions, token budgets, or environment context while a run is in flight — useful for multi-phase tasks where scope changes between steps.3
On SWE-bench Pro, Opus 4.8 scores 69.2% vs 64.3% for 4.7, and tops GPT-5.5 by over 10 points.4 Browser agent (Online-Mind2Web): 84%. Dynamic Workflows is still research-preview with unstable APIs, so building production multi-agent products directly on top of it carries breakage risk before GA.
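The two Dynamic Workflows limits (1,000 subagents per session, 16 concurrent) describe a familiar pattern: a capped fan-out with bounded concurrency. The sketch below illustrates that pattern with plain `asyncio`. It is not Anthropic's runtime or API; `run_subagent` and `run_workflow` are hypothetical stand-ins, and the "work" is a placeholder.

```python
import asyncio

MAX_TOTAL = 1_000     # total subagents per session (limit quoted above)
MAX_CONCURRENT = 16   # subagents allowed to run at the same time


async def run_subagent(task_id: int) -> int:
    """Hypothetical stand-in for one subagent's unit of work."""
    await asyncio.sleep(0)  # yield control, as real I/O-bound work would
    return task_id * 2


async def run_workflow(task_ids: list[int]) -> list[int]:
    """Fan out over task_ids, never exceeding either limit."""
    if len(task_ids) > MAX_TOTAL:
        raise ValueError(f"{len(task_ids)} subagents exceeds the cap of {MAX_TOTAL}")

    gate = asyncio.Semaphore(MAX_CONCURRENT)

    async def guarded(task_id: int) -> int:
        async with gate:  # at most MAX_CONCURRENT bodies run at once
            return await run_subagent(task_id)

    # gather preserves input order in its result list
    return await asyncio.gather(*(guarded(t) for t in task_ids))


if __name__ == "__main__":
    results = asyncio.run(run_workflow(list(range(40))))
    print(len(results), results[:5])
```

The practical consequence of a 16-wide gate is that a 1,000-task migration runs as roughly 63 waves, so wall-clock time depends more on the slowest tasks than on the session cap.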
[Figure: Opus 4.8 vs. Opus 4.7 and GPT-5.5 across coding, agentic, and reasoning benchmarks] 3
Opus 4.8 is also now available in GitHub Copilot (Pro+, Business, Enterprise) across VS Code, JetBrains, Xcode, and the CLI — with a 15× premium request multiplier that was in effect until the usage-based billing launched today.5
[Figure: Claude Opus 4.8 in the GitHub Copilot model selector alongside GPT-5.5 and Gemini 3 Pro, available since May 28] 5

3. Microsoft pulls Claude Code licenses — the cost arc closes

The Uber story from Issue 02 now has a sequel at Microsoft's own front door. As of May 31, Microsoft's Experiences + Devices division — Windows, Microsoft 365, Outlook, Teams, Surface — has been told to stop using Claude Code by June 30, 2026.6
Rajesh Jha, EVP of Experiences + Devices, framed it partly as a product alignment story: GitHub Copilot CLI can be shaped for Microsoft's own repos, security expectations, and engineering workflows. But the June 30 deadline is also the last day of Microsoft's fiscal year, and that timing is not coincidental.6
The cost structure is the real story. Claude Code's enterprise pricing: $20/seat/month base plus actual API token usage. GitHub Copilot Enterprise: $39/seat/month flat, no usage surcharge.6 When engineers run agentic sessions — which they do, given Claude Code's SWE-bench score of 80.8% on complex multi-file tasks — the variable component can dominate total cost at any meaningful scale.
Uber's numbers from earlier this year put a concrete floor on this: average monthly spend ran $150–$250 per engineer, with heavy users hitting $500–$2,000. Uber's CTO ran up $1,200 in a single two-hour session.6 These are not edge cases for teams where the tool is actually working.
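The gap is easy to see with the figures quoted above. The sketch below compares a seat-plus-usage bill against a flat per-seat bill; the function names and the 1,000-seat team size are hypothetical, and the prices are taken as stated in this issue rather than from a current price list.

```python
# Per-seat monthly prices as quoted above, in USD.
METERED_SEAT_BASE = 20.0  # Claude Code enterprise seat, before token usage
FLAT_SEAT = 39.0          # GitHub Copilot Enterprise seat


def metered_monthly(seats: int, avg_usage_per_seat: float) -> float:
    """Monthly bill when each seat pays a base fee plus its token usage."""
    return seats * (METERED_SEAT_BASE + avg_usage_per_seat)


def flat_monthly(seats: int) -> float:
    """Monthly bill at a flat per-seat rate."""
    return seats * FLAT_SEAT


def break_even_usage() -> float:
    """Per-seat usage at which the metered bill equals the flat bill."""
    return FLAT_SEAT - METERED_SEAT_BASE


if __name__ == "__main__":
    seats = 1_000  # hypothetical team size
    for usage in (150, 250):  # Uber's reported average range
        print(usage, metered_monthly(seats, usage), flat_monthly(seats))
    print("break-even usage per seat:", break_even_usage())
```

At these prices the metered option is only cheaper below $19 of usage per engineer per month, far under the $150 low end of Uber's reported average.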
The pattern that's emerging: at enterprise scale, capability-per-session is not the bottleneck — predictable cost accounting is. Flat-rate tools survive budget cycles that token-metered tools struggle to justify to finance teams, regardless of which model scores higher on benchmarks.

4. LangGraph v0.4.0 + v1.2.2: WebSocket streaming and checkpoint safety

Two LangGraph releases shipped in the same week, targeting the infrastructure layer that agentic applications run on.
LangGraph SDK v0.4.0 (May 28): A streaming overhaul. The release adds WebSocket stream transports, hardened reconnects, sync subgraph handles, and shared stream subscriptions.7 For applications that run long agent sessions over unreliable connections, this is the difference between a session that recovers and one that silently fails. The v3 streaming primitives and SSE transport are new foundations, not surface features.
LangGraph v1.2.2 (May 26): A targeted bug fix — BaseMessage IDs that arrive as None now get stable IDs assigned before DeltaChannel checkpoint writes.7 The failure mode this prevents: checkpoint corruption when messages without explicit IDs pass through a delta checkpoint write. Not a flashy release, but silent state corruption in long-running workflows is precisely the class of bug that surfaces in production rather than in tests.
The two releases together — v0.4.0's transport layer hardening and v1.2.2's checkpoint fix — read as a concerted push toward production reliability rather than new capability. If you're running LangGraph in anything close to production, both are worth pulling.
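To illustrate the class of bug the v1.2.2 fix addresses, here is a generic sketch of assigning stable IDs to messages before they are written to an ID-keyed store. It does not use LangGraph's classes or reproduce its implementation: `Message`, `ensure_ids`, and the dict-based store are hypothetical, chosen only to show why a `None` ID is dangerous when writes are keyed by ID.

```python
import uuid
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Message:
    content: str
    id: Optional[str] = None


def ensure_ids(messages: list[Message]) -> list[Message]:
    """Return messages where every missing ID has been filled in once.

    Messages that already carry an ID are passed through unchanged, so
    calling this again on its own output is a no-op.
    """
    return [
        m if m.id is not None else replace(m, id=str(uuid.uuid4()))
        for m in messages
    ]


def write_delta(store: dict[str, Message], messages: list[Message]) -> None:
    """Write messages into an ID-keyed store (a stand-in for a checkpoint)."""
    for m in messages:
        if m.id is None:
            # Without an ID, distinct messages would collide on one key.
            raise ValueError("message has no ID; assign one before writing")
        store[m.id] = m


if __name__ == "__main__":
    incoming = [Message("hello", id="m1"), Message("no id yet"), Message("also none")]
    store: dict[str, Message] = {}
    write_delta(store, ensure_ids(incoming))
    print(len(store))
```

The point of assigning IDs before the write, rather than during it, is that every later delta refers to the same message by the same key.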

5. Pydantic AI: Opus 4.8 support, v2 beta track, and a security fix

Pydantic AI shipped two releases on May 28, covering both its production line and its v2 preview track.
v1.104.0 (stable): Adds Claude Opus 4.8 support and fixes a Bedrock single-tool tool_choice cache retention bug.8
v2.0.0b4 (beta): The V2 track consolidates v1.103.0 and v1.104.0 changes. Additional features include list_prompts / get_prompt for McpServer, UIMessage.metadata timestamp round-trip via VercelAIAdapter, and anthropic_eager_input_streaming support on OpenRouterModel.8
An earlier release (v2.0.0b3 / v1.102.0, May 22) patched a security issue (GHSA-cg7w-rg45-pc59): IPv6 NAT64 transition address formats could bypass the SSRF cloud metadata blocklist.8 The affected scenario is narrow (it requires force_download='allow-local' on FileUrl with attacker-influenced input and a NAT64 or ISATAP network configuration), but if your deployment matches it, the patch is mandatory.
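The underlying trick is that a NAT64 address carries an IPv4 address in its low 32 bits, so a blocklist that only compares literal IPv4 strings never sees it. The sketch below shows the general idea using Python's standard `ipaddress` module and the well-known NAT64 prefix `64:ff9b::/96` from RFC 6052. It is not Pydantic AI's patch: `embedded_ipv4` and `is_blocked` are hypothetical helpers, and ISATAP and network-specific NAT64 prefixes are deliberately left out.

```python
import ipaddress
from typing import Optional

# Well-known NAT64 prefix (RFC 6052): the last 32 bits hold an IPv4 address.
NAT64_WELL_KNOWN = ipaddress.ip_network("64:ff9b::/96")


def embedded_ipv4(addr: str) -> Optional[ipaddress.IPv4Address]:
    """Return the IPv4 address that addr is, or carries, if any."""
    ip = ipaddress.ip_address(addr)
    if isinstance(ip, ipaddress.IPv4Address):
        return ip
    if ip.ipv4_mapped is not None:   # ::ffff:a.b.c.d
        return ip.ipv4_mapped
    if ip in NAT64_WELL_KNOWN:       # 64:ff9b::a.b.c.d
        return ipaddress.IPv4Address(int(ip) & 0xFFFFFFFF)
    return None


def is_blocked(addr: str) -> bool:
    """Block anything that resolves to link-local IPv4 space.

    169.254.0.0/16 includes 169.254.169.254, the address commonly used
    for cloud instance metadata endpoints.
    """
    v4 = embedded_ipv4(addr)
    return v4 is not None and v4.is_link_local


if __name__ == "__main__":
    for candidate in ("169.254.169.254", "64:ff9b::a9fe:a9fe", "64:ff9b::808:808"):
        print(candidate, is_blocked(candidate))
```

Here `64:ff9b::a9fe:a9fe` decodes to 169.254.169.254, which is exactly the kind of address a string-level blocklist would wave through.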
The v2 beta track is progressing steadily. The API is still in motion (a tool prepare-callback that returns None now raises TypeError instead of emitting a deprecation warning), so v2 is not yet a production recommendation, but the cadence suggests GA is not far off.

What to watch

Copilot credit consumption data, June vs. July: The promotional uplift for Business ($30) and Enterprise ($70), which runs through August, gives teams a real-world baseline in June and July before September's reversion to standard rates. Teams should be pulling usage reports now. GitHub made April reports available on May 12 as preparation material; monthly statements going forward are the instrument for catching surprises before they turn into Q3 budget conversations.9
Claude Mythos and the capability ceiling above Opus 4.8: Anthropic confirmed Mythos-class models are in limited preview via Project Glasswing for cybersecurity work, with broad availability in "the coming weeks."3 IBM's Project Lightwell has already used a Mythos Preview to flag nearly 3,900 critical open-source vulnerabilities.4 When Mythos goes public, the Effort Mode + Dynamic Workflows architecture in Opus 4.8 is likely the same scaffolding it runs on.
The flat-rate vs. metered market split: Microsoft's move reinforces what Uber's budget implosion showed in April. The enterprise AI coding market is splitting between metered tools (high capability, unpredictable at scale) and flat-rate tools (bounded cost, faster procurement). GitHub Copilot's usage-based model is metered, but with budget controls and pooled org credits it is trying to give enterprises the spending ceiling they need. Whether those controls are sufficient for high-usage orgs will be visible in Q3 renewal conversations.
