AI Coding Tools Weekly: Copilot's billing clock strikes, Anthropic files S-1, and Windsurf becomes something new

Week of May 29 – June 5, 2026

June 1 was the inflection point. GitHub Copilot activated usage-based AI Credits billing across all plans simultaneously — and the developer community's reaction made it the most-discussed AI tooling event of the year so far. Twenty confirmed shipping events landed across seven tools this week: Anthropic filed a confidential S-1 and shipped seven Claude Code releases, Cursor restructured Teams pricing alongside a v3.7 release, Windsurf was rebranded as Devin Desktop with a new multi-agent protocol, OpenAI quietly removed two Codex models and started banning accounts, and Grok Build continued its daily release cadence to v0.2.20. Flat-rate pricing is officially over for the tools that matter most.

Here's what shipped.

GitHub Copilot: billing activates, backlash erupts, and features keep coming

The AI Credits switch

On June 1, GitHub Copilot replaced its premium-request counting model with AI Credits across every plan — Free, Student, Pro, Pro+, Max, Business, and Enterprise. 1 One AI credit = $0.01. Code completions and Next Edit Suggestions are exempt and remain unlimited. Everything else — chat, the cloud agent, Spaces, code review, Spark, and third-party agents — now runs on credits. Copilot Code Review consumes both AI credits and GitHub Actions minutes. 1

New user sign-ups are paused across all plans as of June 1. GitHub says registrations will reopen "in the coming weeks." 1 Existing users can upgrade to Copilot Max (higher credit pools and spending caps), but new accounts cannot be created. Enterprise admins now have user-level budget controls with email alerts at configurable thresholds. 1

Model pricing varies significantly by tier: GPT-5.5 runs $5.00/M input and $30.00/M output; MAI-Code-1-Flash is $0.75 input and $4.50 output. 2

Developer reaction

r/GithubCopilot lit up immediately. The "Death of Copilot 2026" thread hit 80 upvotes with 61 comments; Google Trends data cited in a separate post shows Copilot search interest falling to 16% of its January peak. 3 4

The specific numbers are striking. u/multiversekyle wrote: "The $10 Copilot plan used to last me most of the month on average and now it lasts 1 day of light usage under the new plan, that's quite a big jump." 3 u/isnoir: "All 12 members of our team reached the limit of the Max plan barely 5 days into the month, while being conservative with usage." [cite:5|Reddit r/GithubCopilot: End of an Era [June 1, 2026]|[https://www.[reddit.com/r/GithubCopilot/comments/1ttd1hl/]]](https://reddit.com/r/GithubCopilot/comments/1ttd1hl/]]](https://www.reddit.com/r/GithubCopilot/comments/1ttd1hl/]])) One user on the Pro+ plan described spending €40 in a few days: "A few simple prompts. A few days of development. €40 later. The numbers are merciless." 3

コンテンツカードを読み込んでいます…

Not all reactions are negative. MAI-Code-1-Flash is drawing praise as a cost-effective path through the new billing structure: user u/horendus reported using it for a full day of IoT web development, spending roughly 25% of the $10 plan over 8 hours. "Better than gpt5.4mini my other go-to cheap model. GPT mini will at times seem to lock up on a task where as MAI had no such issues all day!" 5

Julia Rapczynska of Push-based.io framed the structural shift clearly: Copilot is no longer just an autocomplete tool — it runs chats, agents, reviews, repository analysis, and multi-step workflows. "Autocomplete still feels like an editor feature, but agent behavior feels more like compute." 6 Once you accept that framing, AI Credits follows the same logic as cloud compute billing. Whether teams will accept it is a different question.

GitHub has not published any response to the community discussion thread (#197089) as of this writing.

MAI-Code-1-Flash and this week's feature launches

New Release: MAI-Code-1-Flash available for GitHub Copilot — model selection dropdown — MAI-Code-1-Flash arrives in Copilot's model selector alongside GPT-5.5 and Claude Opus 4.7 7

MAI-Code-1-Flash — Microsoft's first internally built coding model — landed in GitHub Copilot on June 2, starting with VS Code and rolling out to all plans (Free through Max). 7 At $0.75/M input and $4.50/M output it sits below Claude Opus 4.8 and GPT-5.5 in cost by a wide margin. Microsoft describes it as the highest-quality model of its size class in early testing, beating other small models. 7 GPT-4.1 was deprecated the same day. 8

The rest of the week's Copilot launches were dense with enterprise tooling, all shipping between June 2 and June 4:

1M-token context windows and configurable reasoning levels across VS Code, Copilot CLI, and the Copilot app 9
Budget and usage management APIs GA — create, update, delete budgets via API; Usage Summary API queryable by organization, repository, cost center, product, and SKU at yearly/monthly/daily granularity 10
Copilot SDK GA, Copilot Memory extended to Business and Enterprise, Cloud and Local Sandboxes public preview, cloud agent task scheduling and a /chronicle command for cross-session insights — all June 2 8
Enterprise Teams GA and Agent Tasks REST API open to Pro/Pro+/Max — June 4 8
Copilot CLI v1.0.60 (June 5): billing help topic, vim-style navigation keys (g, G, Ctrl+D, Ctrl+U) in /diff view, Ctrl+S to stash and pop the current prompt (matching a Claude Code shortcut). v1.0.59 added /voice with local speech-to-text; v1.0.58 defaulted Rubber Duck on and introduced /every and /after for prompt scheduling 11

For engineering managers with Business or Enterprise accounts: the budget management APIs GA means you can now programmatically monitor spend by team, cost center, and model — the controls are there, they just require setup.

Anthropic: S-1 filed, $65B round background, and seven Claude Code versions

On June 1, Anthropic confidentially submitted a draft Form S-1 registration statement to the SEC, initiating the IPO process. Share count and offering price are not yet determined. The announcement was published under Rule 135 of the Securities Act of 1933 — informational only, not an offer to buy or sell. 12 Financial details remain undisclosed; they will become public only after the SEC completes its review. What's already known from a May 28 announcement (one day before this window): Anthropic closed a $65B Series H at a $965B valuation, with a run-rate revenue figure above $47B per year. 13 CFO Krishna Rao named Claude Code explicitly among the products driving growth: "We work tirelessly to make tools like Claude Code and Cowork more helpful, more powerful, and more adaptable to their needs." 13

Claude Code: 7 releases, ultracode, and managed version locks

Claude Code shipped from v2.1.157 to v2.1.165 across seven versions from May 29 to June 5, reaching 130K GitHub stars (up from 128K the prior week). 14 Key changes for teams using it in managed environments:

v2.1.160 (June 2) renamed the workflow trigger keyword from the previous value to ultracode — any prompts or scripts using the old keyword need updating. The same release added acceptEdits mode protection for build-tool config files (.npmrc, .yarnrc*, bunfig.toml, and similar) and added a confirmation prompt before writing shell startup files. 14

v2.1.158 (May 30) expanded Auto mode to Opus 4.7 and Opus 4.8 running on Amazon Bedrock, Google Vertex AI, and Microsoft Azure AI Foundry (requires CLAUDE_CODE_ENABLE_AUTO_MODE=1). 14

v2.1.163 (June 4) introduced managed version locking: requiredMinimumVersion and requiredMaximumVersion settings let enterprise admins pin the Claude Code version their org uses, preventing unauthorized upgrades mid-project. The same release added /plugin list with --enabled/--disabled filters and the /btw copy shortcut (press c to copy output while preserving raw markdown). 14

v2.1.157 (May 29) enabled automatic loading of plugins from .claude/skills/ and added the claude plugin init scaffold command. 14

The JetBrains 2026 developer survey provides useful context on adoption: 46% of engineers with 10+ years of experience name Claude Code as their primary daily tool; only 9% choose GitHub Copilot. 15 The spread between those two numbers grew notably this week.

Also on June 2: Anthropic expanded Project Glasswing — its critical infrastructure security initiative — from roughly 50 to approximately 150 partner organizations across 15+ countries, adding power utilities, water systems, healthcare providers, and telecom operators. It launched Claude Security (using Opus 4.8) to scan codebases for vulnerabilities and generate patches. 16

Cursor v3.7: Design Mode goes multiselect, voice lands, and pricing restructures

Cursor shipped v3.7 in two drops this week. The June 4 drop added Design Mode support inside Canvas and an interactive context usage report that breaks down token distribution across system prompts, tool definitions, rules, and skills, with an embedded "Debug with Agent" button. 17 The June 5 drop extended Design Mode to support multi-element selection in the browser: pick two or more elements and the agent sees their code, layout relationships, and visual state together, letting you adjust a group of components in one pass. 17 Voice input is now available via the Design Mode overlay — describe changes verbally while the agent runs, and queue the next instruction before the current one finishes. 17

Enterprise Organizations reached GA on June 3: multi-team management, organization-level identity provider integration, and usage analytics are all available. Shared Canvases can now be opened full-screen in a browser and support embedded action buttons that trigger specific agent prompts.

Teams pricing restructure

On June 1, Cursor announced a new Teams pricing structure effective immediately for new customers (renewals start July 1, 2026). 18

Standard seat: $40/month (monthly) / $32/month (annually) — unchanged
Premium seat: $120/month (monthly) / $96/month (annually) — new tier with 5× the usage at 3× the price

Usage splits into two independent pools: Composer+Auto (for Cursor's own models and Auto mode) and third-party API (for external model calls). Admins can set dollar-threshold Slack and email alerts per pool. Cursor's internal Developer Habits Report found that a small fraction of heavy users drive most unpredictable spend; Premium is aimed at that group. 18

The positioning is explicit: Cursor Composer 2.5 (the in-house model available in the Composer+Auto pool) benchmarks near Opus 4.7 and GPT-5.5 quality at $0.50/$2.50 per million tokens. If your heavy users are doing the math, Standard-to-Premium is a better deal than the API top-up costs they were previously incurring.

Windsurf becomes Devin Desktop, and Cognition backs it with a $10M guarantee

On June 2, Cognition rebranded Windsurf as Devin Desktop (v3.0.12), completing the integration of the tool it acquired in July 2025. 19 The rebrand is not cosmetic — the architecture changed substantially.

Devin Desktop — the next generation of Windsurf, built around Devin Cloud and the Agent Command Center — Devin Desktop announcement cover 19

Agent Command Center is now the default interface: a Kanban-style view that manages all local and cloud agents in one place. Agent Client Protocol (ACP), an open standard, lets Devin Desktop run Codex, Claude Agent, OpenCode, and custom internal agents in the same Kanban view with the same context-sharing as Devin itself. 19

Devin Local (the replacement for Cascade) was rewritten in Rust — Cognition claims approximately 30% fewer tokens versus Cascade — and now supports subagent spawning. The old Cascade agent remains usable through July 1 for teams that need time to migrate. 19 On June 4, v3.0.21 added a "Reset migration from Windsurf" command for users who want to wipe Devin Desktop settings and re-import extensions from scratch. 20

Ramp, whose engineers use Devin across several workflows, described the practical benefit: "Devin Desktop makes it easy to dispatch and monitor our array of agents from a single command center." 19 NVIDIA cited multi-agent coordination across complex workflows as their use case; Harvey pointed to context persistence extending to individual engineers' local environments. 19

Pricing, extensions, and existing features carry over unchanged for current Windsurf users.

The $10M productivity guarantee

On June 4, Cognition introduced what it calls the first financial warranty in the AI coding space: if Devin delivers engineering value below what the customer paid for, Cognition will fund the difference in continued usage up to $10M. 21 The guarantee is backed by a productivity measurement system that translates Devin's output into human-equivalent engineering hours. Cognition raised $1B at a $26B valuation in late May with $492M run-rate revenue. 21

The guarantee is targeted at enterprise customers evaluating whether the cost is justified — converting a software spend into a measurable productivity number is the strongest pitch possible in a week when every other platform was struggling to explain its credit math.

OpenAI Codex: model sunsets without warning, account bans on June 5

GPT-5.2 and GPT-5.3-Codex removed

Between June 1 and June 2, OpenAI silently removed GPT-5.2 and GPT-5.3-Codex from Codex for ChatGPT subscribers — no advance notice, no stated reason, no replacement option. 22 The community thread drew 117 replies. User "Dene" summarized the dominant complaint: "gpt-5.3-codex is much better than the new models when it comes to the code and codebase understanding. gpt-5.5 is so mumbo jumbo, simply wasting the tokens and leaving me with such a mess that I have to revert all the changes." 22 User "Flow47": "I think this takes the cake for the craziest AI company move (yet) that impacts my daily workflow... just taking it away, no notice, no nothing, no replacement." 22

Account bans hitting Codex users

On June 5, multiple Codex users began reporting their accounts were banned without warning. Ban emails cite "recent activity violated our Terms and Usage Policies" without specifics. One affected user wrote: "I've just been vibe coding my own apps, I had a personal codex account and one I used for business. I read everywhere that this was okay." 23 OpenAI has not commented. The moderator recommendation is to appeal via help.openai.com.

Whether these are targeted enforcement actions or systematic account review errors is not yet clear from the available community reports.

Codex CLI v0.137.0 and the Bedrock launch

Codex CLI v0.137.0 reached stable on June 4 with multi-agent v2 (per-thread runtime choice, cleaner follow-up defaults), enterprise credit-limit display, remote control pairing via app-server v2 RPC, codex plugin list --json output, and improved web tool availability across code modes. 24 The repository is at 88.9K stars.

On June 1, OpenAI made Codex and its frontier models fully available on Amazon Bedrock — in both AWS Commercial and GovCloud regions. 25

Grok Build v0.2.20: image generation, Claude/Cursor config support, daily release pace

xAI shipped Grok Build from v0.2.8 (May 28) through v0.2.20 (June 3) — roughly 12 releases in six days. 26

v0.2.16 (May 31) was the interop milestone: Claude/Cursor skills, AGENTS.md, MCP servers, and plugins are all now configurable in Grok Build. 26 Teams running Claude Code or Cursor can import their config files directly — no rewrite required. The release also added --permission-mode acceptEdits to auto-approve file edits and enabled btrfs snapshot worktrees via symlink.

v0.2.20 (June 3) added image_to_video and reference_to_video generation tools, a bundled imagine skill, and structured compaction prompts (successor-assistant, carry-forward, and block modes). Monitors are now visible to the model and can be terminated programmatically. Markdown table rendering fixed ghost-cell artifacts. 26

Terminal video playback reached 30fps in v0.2.11; the default retry budget was raised to approximately 5 minutes. Pricing remains $30/month on SuperGrok. xAI has no public GitHub repository for Grok Build — the changelog at x.ai/build/changelog is the only official source. 26

The release pace — daily or faster — is the signal here. Grok Build is still beta, has no public benchmarks, and requires a $299/month SuperGrok Heavy subscription for the full feature set (introductory price $99/month for the first six months). Teams evaluating it should treat it as a moving target.

Brief notes: Replit, Gemini CLI countdown, and the quiet tools

Replit had an active week. On June 4, it launched a Shopify storefront builder that goes "from first prompt to taking real orders in roughly ten minutes." 27 On June 3, a dedicated SEO Agent launched to scan apps for discoverability issues and auto-fix semantic HTML, Open Graph tags, and robots.txt/sitemap.xml. On June 1, Replit announced integration with Microsoft Fabric via an open-source SDK/CLI called Rayfin, enabling AI apps built in Replit to deploy directly into Fabric's enterprise governance layer. 27

Google Antigravity / Gemini CLI: the June 18 shutdown deadline for Gemini CLI (free individual, Google AI Pro, and Google AI Ultra users) is now 13 days away. No new migration guides or user notifications were published this week. The developers.google.com/gemini-code-assist page now redirects to a generic "Build with Gemini" landing page with no Antigravity-specific content. 28 If you haven't migrated to Antigravity CLI, the deadline is real.

Model rankings: the June 2026 BuildMVPFast coding benchmark lists GPT-5.5 first (82.7% on Terminal-Bench 2.0, 58.6% on SWE-Bench Pro), Claude Opus 4.8 second, and Kimi K2.6 (Moonshot AI's 1T-parameter MoE model, 32B active params, Apache 2.0 license) as the top open-weight option at $0.95/$4.00 per million tokens. 29 For teams considering cost-per-merged-change rather than per-token pricing, Cursor Composer 2.5 tops the DeepSWE benchmark ROI analysis. 30

New tools: JetBrains open-sourced Mellum2 (June 1), an on-device coding model positioned for scenarios where Claude Code cannot reach — offline, air-gapped, or latency-constrained environments. 31 SkipLabs shipped Skipper (June 1), a coding agent designed to deliver code without requesting user approval at each step. 32

Uber burned through its full 2026 AI budget in Q1, primarily on Claude Code token costs — reported by Fortune on May 26. 33 The story surfaced widely in engineering management circles this week. A 2026 Q1 survey cited by Alex Merced (AI Weekly) found 42% of developers rate cost volatility as their biggest AI tooling pain point, above model reliability. 15

Quiet tools: Tabnine's last blog post remains May 6; Continue.dev's last release is still March 27 (v1.3.38); Aider's last release is August 2025 (v0.86.0, roughly 10 months ago). 34 35 36

What to watch next week

GitHub's response to the billing backlash. Community Discussion #197089 has received no official reply as of June 5. Whether GitHub adjusts credit allocations, adds spending visibility in the IDE, or simply reopens new-user registration will be the leading indicator of how seriously it takes the criticism.

Gemini CLI shutdown on June 18. With no new migration communications from Google this week, the question is whether users who haven't moved to Antigravity CLI will hit a hard cutoff or whether Google extends the deadline quietly.

The Codex account ban pattern. Multiple ban reports surfaced on June 5 with no clear triggering criteria. If this is a systematic account review rather than targeted enforcement, more reports will emerge. OpenAI's response (or silence) will tell you whether this is a policy problem or a tooling mistake.

Copilot's next move. New user sign-ups remain paused. The longer that holds, the more developers evaluating their first AI coding tool will default to Cursor or Claude Code — tools that built agent-first pricing from the start.

Cover image: AI-generated illustration.