The billing reset: AI coding tools, spring 2026

For three years, the dominant pricing model in AI coding tools was simple: pay a flat monthly fee, get a lot of completions, and use the agent as much as you want. That era is ending. In the span of a few weeks this spring, GitHub Copilot announced it would move to consumption-based billing on June 1, Windsurf rolled out per-token pricing via its Adaptive routing system, and Devin restructured its self-serve tiers entirely. The driver in every case is the same: agentic sessions, multi-step tasks in which an AI works through an entire codebase over minutes or hours, cost an order of magnitude more to run than chat completions. A single Claude Opus 4.7 agent run can consume more compute than 50 standard chat turns. The flat-rate model was subsidizing that, and the subsidy is over.

That billing reset is the organizing frame for reading this window's major moves. GitHub Copilot is managing the pain of the transition. Cursor is racing to lock in enterprise dependency before the market consolidates. Windsurf is bundling Devin Cloud to justify a premium tier. Tabnine has exited the consumer market entirely. Amazon is shutting down Q Developer and replacing it with a new product. And Claude Code and OpenAI Codex have gone from sideline players to serious share threats.

Even the market's size is contested: analyst estimates for 2026 range from $9.5B to $12.8B, and the disagreement reflects genuine uncertainty about whether the value layer sits at the IDE, the model, or the enterprise integration stack.

GitHub Copilot: the most significant pricing overhaul in the product's history

The sequence of announcements from GitHub between April 20 and May 12 tells a story about a product running into the economics it tried to avoid.

On April 20, GitHub paused new sign-ups for Copilot Pro, Pro+, and Student plans, removed Claude Opus from the Pro tier (leaving it only in Pro+), and introduced session and weekly token caps. 1 GitHub VP of Product Joe Binder was direct about why: "Agentic workflows have fundamentally changed Copilot's compute demands. Long-running, parallelized sessions now regularly consume far more resources than the original plan structure was built to support." 1 He put the specific problem plainly: "it's now common for a handful of requests to incur costs that exceed the plan price." 1

Seven days later, GitHub announced the new billing model: starting June 1, all Copilot plans migrate to GitHub AI Credits, where 1 credit = $0.01. 2 Base plan prices don't change — Pro stays $10/month, Pro+ $39/month, Business $19/seat, Enterprise $39/seat — but each plan now includes a monthly credit allotment equal to its price in dollars. 2 Code completions and Next Edit suggestions remain free. What costs credits: any chat, agent mode, or code review task that uses a premium model.
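To make the unit conversion concrete, here is a minimal Python sketch of the base allotments as described above. It uses only the figures quoted in this section (1 credit = $0.01, allotment equal to the plan price); the function and variable names are illustrative and are not part of any GitHub API.

```python
# Figures quoted in the article: 1 GitHub AI Credit = $0.01, and each plan's
# base monthly allotment equals its price in dollars.
CREDITS_PER_USD = 100

# Monthly price in USD (per seat for Business and Enterprise).
BASE_PRICE_USD = {
    "Pro": 10,
    "Pro+": 39,
    "Business": 19,
    "Enterprise": 39,
}


def base_credits(plan: str) -> int:
    """Base monthly allotment for a plan, expressed in credits."""
    return BASE_PRICE_USD[plan] * CREDITS_PER_USD


def credits_to_usd(credits: int) -> float:
    """Dollar value of a number of credits."""
    return credits / CREDITS_PER_USD


for plan in BASE_PRICE_USD:
    print(f"{plan}: {base_credits(plan):,} credits per month")
```

On these numbers, a Pro subscriber starts each month with 1,000 credits and a Pro+ subscriber with 3,900, before any of the later adjustments.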

On May 12, after community feedback made clear that the original allotments were too small, GitHub added Flex allotments and a new Max plan: 3

Plan	Price	Total monthly credits (USD value)
Pro	$10/month	$15 (base $10 + flex $5)
Pro+	$39/month	$70 (base $39 + flex $31)
Max (new)	$100/month	$200 (base $100 + flex $100)
Business	$19/seat	$30 promotional (Jun–Aug)
Enterprise	$39/seat	$70 promotional (Jun–Aug)

The math still troubles heavy agent users. Independent analysis from UsageBox estimates that a Pro+ user running a typical mix of agent mode sessions and code review with frontier models (Claude Opus 4.7 or GPT-5.5) can exhaust $39 in credits in roughly three to four days. 4 UsageBox's summary of who gets hurt: "A subset of teams that lean on agent mode and Pro+ models are going to pay noticeably more, and they are also the teams that don't know that yet." 4
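The UsageBox estimate implies a daily burn rate, which can be extrapolated to the larger allotments announced on May 12. The sketch below is a back-of-the-envelope calculation only: it assumes spending is constant from day to day and that the same workload is run on every plan, neither of which comes from UsageBox.

```python
def daily_burn_range(budget_usd: float, fastest_days: float, slowest_days: float):
    """Return (low, high) daily spend implied by exhausting a budget in a range of days."""
    return budget_usd / slowest_days, budget_usd / fastest_days


def runway_days(total_usd: float, burn_low: float, burn_high: float):
    """Return (shortest, longest) number of days a credit total lasts at that burn."""
    return total_usd / burn_high, total_usd / burn_low


# UsageBox figure from the article: $39 exhausted in roughly three to four days.
burn_low, burn_high = daily_burn_range(39, fastest_days=3, slowest_days=4)

# Totals from the May 12 table (base + flex), in dollars.
for plan, total in {"Pro+": 70, "Max": 200}.items():
    shortest, longest = runway_days(total, burn_low, burn_high)
    print(f"{plan}: ${total} lasts about {shortest:.1f} to {longest:.1f} days")
```

Under those assumptions the implied burn is $9.75 to $13 per day, so even the $70 Pro+ total covers roughly a week of that workload rather than a month.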

Community reaction was loud. The HN thread on the April 27 announcement reached 767 points and 553 comments. 5 Reddit's r/GithubCopilot saw a wave of cancellations, with users migrating to OpenAI Codex, Cursor, Claude Code, and DeepSeek. 6 One user who had subscribed since Copilot's public beta in 2021 wrote simply: "it's time to say goodbye." 6

https://news.ycombinator.com/item?id=47923357

Not everyone read it negatively. Enterprise buyers with 150+ seats pointed out that Copilot's model — where subscription fees convert 1:1 to usable credits — compares favorably to Claude Enterprise, which charges a seat fee and provides no included usage budget. The credits system is, as UsageBox put it, "more honest" than the premium-request model it replaces — it just exposes costs that were previously invisible. 4

The strategic read: GitHub is protecting margin while Copilot serves 140,000 organizations — a 3x year-over-year increase. 7 The billing change risks developer goodwill but is structurally necessary. The real question is whether Copilot's GitHub platform integration — code review, issues, CI/CD — keeps enterprises that might otherwise consolidate onto Cursor or Codex. GitHub CPO Mario Rodriguez framed the product's differentiator clearly: "The bottleneck has moved to delivering software — review, security, governance and deployment. Copilot covers the complete surface of the SDLC." 7

Cursor: revenue momentum meets enterprise ambition

Cursor (made by Anysphere) reported $2 billion in annualized revenue in February, up from roughly $1 billion three months earlier — one of the faster ARR climbs in SaaS history. 8 Enterprise clients now account for roughly 60% of that revenue, which means the individual-developer churn to Claude Code and other cheaper tools is being absorbed by higher-value corporate contracts. 8 A separate April report put Anysphere in talks to raise $2B+ at a $50B pre-money valuation, with a projected 2026 year-end ARR exceeding $6 billion. 9
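For a sense of scale, the reported figures can be turned into implied compound monthly growth rates. This is a rough sketch, not a forecast: it assumes smooth compounding, and it assumes ten months between the February figure and year-end, which the reports do not state precisely.

```python
def monthly_growth(start: float, end: float, months: float) -> float:
    """Compound monthly growth rate that takes `start` to `end` over `months`."""
    return (end / start) ** (1 / months) - 1


# Reported: roughly $1B to $2B annualized revenue over three months.
recent = monthly_growth(1.0, 2.0, months=3)

# Projected: more than $6B ARR by year-end 2026, from $2B in February.
# The ten-month gap is an assumption made for this illustration.
required = monthly_growth(2.0, 6.0, months=10)

print(f"Recent pace:   {recent:.1%} per month")
print(f"Required pace: {required:.1%} per month")
```

On those assumptions the projection needs about 11.6% monthly growth, less than half the roughly 26% pace implied by the most recent doubling.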

The product moves over the window all point in one direction: building surfaces that are harder to abandon.

Cursor 3 (April 2), built from scratch rather than as a VS Code fork, introduced multi-repository layouts, parallel agent execution (local and cloud agents visible in the same sidebar), and integrations with Slack, GitHub, Linear, and Jira. 10 Co-founders Michael Truell and Sualeh Asif described the design intent as fixing the "micromanagement problem" — developers tracking multiple agents across separate terminals and windows. Cursor 3 unifies them.

Composer 2.5 (May 18), Cursor's proprietary coding model, brought a meaningful quality jump over Composer 2. It introduces targeted textual feedback reinforcement learning — instead of scoring full rollouts, the training inserts error signals directly at the point in a sequence where the mistake occurred — and uses 25x more synthetic training tasks than Composer 2. 11 Fast mode pricing doubled from $1.50/$7.50 to $3.00/$15.00 per million tokens, but first-week users get double usage allocation. 11 Cursor also announced a training partnership with SpaceX to use xAI's Colossus 2 infrastructure — one million H100-equivalent chips — for a next-generation model with 10x the compute of Composer 2. 12
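The effect of the fast mode price change on a single session can be sketched as follows. The two figures in each pair are read here as input and output rates per million tokens, which is an assumption about the notation, and the token counts are hypothetical.

```python
def session_cost(input_tokens: int, output_tokens: int,
                 input_rate: float, output_rate: float) -> float:
    """Cost in USD, given rates quoted per one million tokens."""
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


OLD_RATES = (1.50, 7.50)    # fast mode rates quoted before the change
NEW_RATES = (3.00, 15.00)   # fast mode rates quoted after the change

# Hypothetical agent session: 2M input tokens, 200K output tokens.
tokens = (2_000_000, 200_000)

old = session_cost(*tokens, *OLD_RATES)
new = session_cost(*tokens, *NEW_RATES)
print(f"old: ${old:.2f}  new: ${new:.2f}  ratio: {new / old:.1f}x")
```

Because both rates doubled, the session cost doubles regardless of the input/output mix; the first-week double allocation offsets that only temporarily.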

Bugbot moved from a $40/seat/month subscription to pure usage-based billing effective after June 8, averaging $1.00–$1.50 per PR review. 13 A higher-effort tier detects 35% more bugs per run. The shift tracks with Cursor's broader per-consumption model.
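Whether a team pays more or less under the new Bugbot billing depends on review volume per seat. A minimal break-even sketch, using only the figures above and treating the quoted per-PR average as a fixed price (a simplification):

```python
FLAT_USD_PER_SEAT = 40.0           # previous monthly subscription per seat
PER_PR_USD_RANGE = (1.00, 1.50)    # quoted average cost per PR review


def break_even_prs(flat_usd: float, per_pr_usd: float) -> float:
    """Monthly PR reviews per seat at which usage billing equals the flat fee."""
    return flat_usd / per_pr_usd


low_rate, high_rate = PER_PR_USD_RANGE
print(f"Break-even at ${high_rate:.2f}/PR: {break_even_prs(FLAT_USD_PER_SEAT, high_rate):.1f} PRs")
print(f"Break-even at ${low_rate:.2f}/PR: {break_even_prs(FLAT_USD_PER_SEAT, low_rate):.1f} PRs")
```

A seat that reviews fewer than about 27 PRs a month comes out ahead under usage billing, a seat above 40 pays more, and anything in between depends on where its per-PR average lands.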

The Cursor SDK (April 29, public beta), installable via npm install @cursor/sdk, lets teams launch agents programmatically against cloud, local, or self-hosted runtimes. 14 Customers including Faire, Rippling, Notion, and C3 AI are already using it. George Jacob, Faire's senior engineering manager, described the value: using Cursor's cloud runtime to run many parallel agents "without managing VMs or working around memory limits." 14 The SDK is the unlock that lets Cursor embed into CI/CD pipelines rather than staying inside an IDE.

Gartner named Cursor a Leader in its 2026 Magic Quadrant for Enterprise AI Coding Agents, with the furthest placement on the Completeness of Vision axis. 15 More than 70% of the Fortune 500 now use Cursor to deploy and manage coding agents. 15

https://cursor.com/blog/third-era

The strategic read: Cursor CEO Michael Truell has described the third era of AI software development as moving from agent conversations to autonomous cloud agents that create the software factory itself — "Cursor is no longer primarily about writing code. It is about helping developers build the factory that creates their software." 16 The SDK, the JetBrains integration (via the Agent Client Protocol, launched March 4), Bugbot, and Composer 2.5 are all execution on that thesis. The risk is that Cursor's pricing is structurally higher than Claude Code or Codex for individual users — the individual developer exodus to cheaper tools is real, even if enterprise revenue is compensating for it so far.

Windsurf: Devin inside the IDE, per-token pricing begins

Windsurf's biggest move of the window was Windsurf 2.0 (April 15), which embedded Cognition's Devin Cloud directly into the IDE. 17 All self-serve plans — Pro ($20/month), Max ($200/month), and Teams ($40/user/month) — now include Devin Cloud access. 18 The new Agent Command Center presents all local and cloud agent sessions in a kanban view organized by status, and Spaces group sessions, PRs, and files into task-level contexts.

The pricing architecture shift came earlier: Adaptive (April 6) introduced a model routing option that selects the cheapest model appropriate for each task, billed at per-token rates rather than flat credits. 17 19 Promotional pricing during the launch window was $0.50/1M input tokens and $2.00/1M output tokens for additional usage. 19 In-IDE token counters and per-token pricing display in the model selector make consumption visible — the same transparency move GitHub is making with credits.
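The idea behind cost-aware routing, picking the cheapest model that is still adequate for the task, can be sketched in a few lines. This is an illustration of the general technique, not Windsurf's implementation: the model names, capability tiers, and per-token rates below are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    capability: int           # hypothetical tier: 1 = simple edits, 3 = hardest tasks
    input_usd_per_m: float    # hypothetical rate per 1M input tokens
    output_usd_per_m: float   # hypothetical rate per 1M output tokens


# Hypothetical catalog; none of these entries are real products or prices.
CATALOG = [
    Model("small-model", 1, 0.50, 2.00),
    Model("mid-model", 2, 3.00, 15.00),
    Model("frontier-model", 3, 15.00, 75.00),
]


def estimated_cost(model: Model, input_tokens: int, output_tokens: int) -> float:
    """Estimated cost in USD for one task on the given model."""
    return (input_tokens * model.input_usd_per_m
            + output_tokens * model.output_usd_per_m) / 1_000_000


def route(required_capability: int, input_tokens: int, output_tokens: int) -> Model:
    """Return the cheapest model whose capability tier meets the requirement."""
    eligible = [m for m in CATALOG if m.capability >= required_capability]
    if not eligible:
        raise ValueError("no model meets the required capability")
    return min(eligible, key=lambda m: estimated_cost(m, input_tokens, output_tokens))


print(route(1, 10_000, 2_000).name)   # a simple task goes to the cheapest tier
print(route(3, 10_000, 2_000).name)   # a hard task needs the top tier
```

The hard part in practice is the step this sketch takes as given: deciding what capability a task requires before running it.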

Windsurf added 8+ frontier models over the window, including GPT-5.4 (March 5), GPT-5.5 (April 24), Claude Opus 4.7 (April 16), and Opus 4.7 fast mode (May 12) — each at promotional credit multipliers. 20 Devin for Terminal (April 28), written in Rust, lets users run the Devin agent engine from the CLI, with handoff to Devin Cloud for longer sessions, and reportedly saves up to 30% on tokens compared to Cascade. 20

The strategic read: Windsurf's bet is that Devin bundled at no extra charge converts its mid-tier users into high-frequency agent users, who then upsell themselves into Teams and Max. The Adaptive routing is an attempt to compete on value efficiency: rather than matching Cursor's raw feature depth, Windsurf is positioning itself as the tool that doesn't waste your budget on over-powered models for simple tasks. Whether that message lands depends on how much the typical user actually cares about token-level optimization versus raw output quality.

Tabnine: governance over growth

Tabnine has completed a pivot that started last year: the free Basic tier is gone, the standalone Pro plan is gone, and the product is now enterprise-only. 21 Current pricing is Code Assistant at $39/user/month (code completion, AI chat, IDE support, SSO) and Agentic Platform at $59/user/month (everything in Code Assistant plus autonomous agents, MCP tools, CLI, and the Context Engine). 21 Both tiers require contacting sales; there is no self-serve purchase path.

The product releases over the window all focused on enterprise security controls: Tabnine 6.1 (April 9) introduced CLI sandboxing, three-level command permissions (auto-approve / confirm / disabled) configurable by command prefix, and hard workspace-scope limits on file access that prevent agents from reading /etc/passwd or ~/.ssh. 22 CEO Chris du Toit's framing: "Enterprises are not asking, 'Can AI help our developers move faster?' They are asking, 'Can we trust it inside our systems?'" 22
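The two controls described here, prefix-based command permissions and a workspace scope check, follow a familiar pattern that can be sketched in Python. This is a generic illustration, not Tabnine's configuration format or code: the policy entries, the default level, and the function names are all hypothetical.

```python
from enum import Enum
from pathlib import Path


class Permission(Enum):
    AUTO_APPROVE = "auto-approve"
    CONFIRM = "confirm"
    DISABLED = "disabled"


# Hypothetical policy keyed by command prefix; the longest matching prefix wins.
POLICY = {
    "git status": Permission.AUTO_APPROVE,
    "git": Permission.CONFIRM,
    "rm": Permission.DISABLED,
}
DEFAULT_PERMISSION = Permission.CONFIRM  # assumed fallback for unlisted commands


def permission_for(command: str) -> Permission:
    """Look up the permission level for a command by its longest token prefix."""
    tokens = command.split()
    best_len, best_perm = 0, DEFAULT_PERMISSION
    for prefix, perm in POLICY.items():
        parts = prefix.split()
        if tokens[:len(parts)] == parts and len(parts) > best_len:
            best_len, best_perm = len(parts), perm
    return best_perm


def within_workspace(workspace: Path, target: str) -> bool:
    """True if `target` resolves to a path inside the workspace root."""
    root = workspace.resolve()
    # Joining with an absolute path discards the root; "~" is expanded first.
    candidate = (root / Path(target).expanduser()).resolve()
    return candidate == root or root in candidate.parents


workspace = Path("/tmp/workspace-demo")
print(permission_for("git status --short"))
print(within_workspace(workspace, "src/main.py"))
print(within_workspace(workspace, "/etc/passwd"))
```

Matching on whole tokens rather than raw string prefixes keeps a rule for `git` from applying to an unrelated binary whose name merely starts with those letters. A path check like this is only a first line of defense; a real sandbox would also enforce the limit at the operating-system level.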

The Enterprise Context Engine (GA February 26) continuously models an organization's codebase — repositories, services, dependencies, APIs, architecture relationships — and exposes it as a Skill that agents can call. 23 The positioning is that Tabnine doesn't replace Cursor, Copilot, or Claude Code — it makes them more accurate inside a specific enterprise codebase.

The strategic read: Tabnine is betting that the governance and compliance requirements of large enterprises will create a durable layer above the IDE-level competition. The risk is market access: at $39–$59/user/month with no free tier, Tabnine can't build the developer mindshare that drives bottom-up enterprise adoption the way Cursor and Copilot can.

Contenders and exits: Claude Code, Codex, and Amazon's pivot

The window saw two serious entrants gain significant ground and one established player exit.

Claude Code (from Anthropic) had its most visible window yet. At Code with Claude 2026 in San Francisco (May 6), Anthropic released Managed Agents with a Dreaming capability — where an agent writes notes to itself during a task that future agents can read, enabling knowledge continuity across sessions. 24 Claude Code also gained auto mode (an automatic permission-decision classifier), routines (cron/webhook/API-triggered automation), and remote control across devices. 24 The underlying model, Claude Opus 4.7, scored 87% on SWE-bench Verified — compared to 62% for Sonnet 3.7 a year earlier. 24 Dario Amodei reported Anthropic's revenue at approximately $30 billion annualized — 80x growth in a single quarter. 24 Anthropic's Claude engineering lead Katelyn Lesse described Claude as currently at "mid-level engineer" coding capability, with senior-engineer system design as the next milestone. 25

OpenAI Codex hit 4 million weekly active developers — up roughly 6.7x from ~600,000 at the start of 2026. 26 Gartner named OpenAI a Leader in the same MQ that recognized GitHub and Cursor. Codex added mobile access (ChatGPT iOS/Android preview, May 14), Remote SSH for connecting directly to hosted dev environments, Hooks for automated pre/post-task scripts, and HIPAA compliance for ChatGPT Enterprise workspaces. 27 Cisco reported using Codex to build much of its AI Defense security platform, cutting delivery time from quarters to weeks. 26

Amazon Q Developer is shutting down. New sign-ups stopped May 15, Opus 4.6 was removed from the Pro tier on May 29, and full end-of-support hits April 30, 2027. 28 The replacement is Kiro (kiro.dev), a new agentic development environment built around spec-driven development: structured requirement documents (Specs) drive end-to-end implementation, with Hooks for file-save/commit-triggered automation and Steering files for persistent project context. 28 AWS's explanation: "The most impactful AI developer experiences go far beyond code generation and completion. Developers need AI that understands an entire project — architecture, requirements, testing, and code intent."

Quick signals worth tracking

Devin's self-serve pricing reset: Cognition replaced its original Core ($500/month) and Team plans with Free / Pro ($20/month) / Max ($200/month) / Teams (minimum $80/month). 29 Ask Devin Deep Mode, Devin Review, and DeepWiki moved from flat to usage-based billing. The $500 entry price was broadly cited as the main barrier to adoption; a $20/month Pro plan and an $80/month Teams minimum are a meaningful reduction.

Replit self-serve enterprise: Organizations can now buy and configure Replit Enterprise without a sales process, with SSO, SCIM, and RBAC included. 30 The key signal: enterprise AI tooling is compressing sales cycles, not extending them.

The Gartner quadrant: The 2026 Magic Quadrant for Enterprise AI Coding Agents places GitHub (highest on Ability to Execute), Anthropic, Cursor, and OpenAI in the Leaders quadrant. 7 Cognition, AWS, Google, and Alibaba Cloud are Challengers. Tabnine sits in Visionaries. JetBrains is a Niche Player.

This quadrant is increasingly the enterprise buyer's first filter — placement here matters regardless of how developers feel about the individual tools.

The trust gap: Stack Overflow's 2025 developer survey found that 84% of developers use or plan to use AI tools, but only 29% trust them, down from 40% in 2024. 31 The gap reflects a genuine calibration problem: AI coding tools have gotten measurably better at benchmark tasks (Claude Opus 4.7's 87% on SWE-bench Verified versus 62% for Sonnet 3.7 a year earlier), but the failure modes (hallucinated APIs, subtly wrong refactors, leaked .env files) remain hard to detect without reading the code. At Code with Claude London, roughly half the audience admitted to merging Copilot-generated code without reading it. 25 Developers who do so are betting that tooling quality now exceeds the value of human review. That bet may be premature, but it is being made at scale.

Three variables to watch into Q3 2026:

Copilot retention after June 1. The billing change applies to all Copilot plans, and the first credit statements will land in July. Whether the community backlash translates into sustained churn or settles into acceptance once users see actual bills will determine whether other flat-rate holdouts (Windsurf's standard plan, Cursor's Pro) face pressure to follow.
Cursor's next model. The SpaceX/Colossus 2 training run — 10x the compute of Composer 2 — is in progress. If Cursor ships a frontier-quality proprietary model before end of year, the argument for using third-party models (Claude, GPT-5.5) inside Cursor weakens. That's a different competitive position than where the company started.
Kiro's reception. Amazon's Q Developer had meaningful enterprise penetration before the shutdown announcement. Kiro needs to convert Q Developer users before the 2027 EOS deadline while simultaneously earning new users on its own terms. How the developer community receives spec-driven development as a workflow — versus the more improvisational agent-session style Cursor and Windsurf favor — will be an interesting signal about where enterprise AI coding is headed.
