Three voices, one problem: AI is scaling output, not judgment

Linus Torvalds escalated from "unmanageable" (rc4) to "hardnosed" (rc5) on AI-triggered patch churn in the kernel. George Hotz called AI agent adoption in software development "one of the most costly mistakes in the field's history" after six months of personal experiments. Armin Ronacher published the numbers: roughly 80% of incoming issues and pull requests auto-closed, and an 8% PR merge rate. Three authors, no coordination, same structural diagnosis: AI scales output, not judgment.

Top OSS Authors on Tech Choices and Product Design
2026/5/26 · 2:09
The most useful signal from this week isn't any single post. It's the timing. Three people who don't share a codebase, a community, or even a general worldview published essentially the same diagnosis within 48 hours of each other: AI tools produce output at scale; the cost of sorting that output falls on humans; and the humans downstream are starting to push back.
Linus Torvalds said it from inside the Linux kernel release process. George Hotz said it from six months of personal experiments with coding agents. Armin Ronacher (Flask and Jinja2 author) said it with actual issue tracker numbers. None of them were responding to each other. That's what makes this week worth reading carefully.

Linus Torvalds: from "unmanageable" to "hardnosed"

The arc across this week's two release announcements is sharper than either post on its own.
On May 18, when Linus published the Linux 7.1-rc4 announcement on the Linux Kernel Mailing List, he focused on the security list specifically 1:
"AI tools are great, but only if they actually help, rather than cause unnecessary pain and pointless make-believe work."
The problem he described: multiple people running the same AI scanning tools over the same codebase, filing separate reports for the same findings, with no patches attached and no awareness that someone else filed the same thing a day earlier. He also flagged a structural issue — AI-detected bugs are "by definition not secret," so routing them through a private security list made the duplication worse by preventing reporters from seeing each other's submissions.
His ask was specific: if you find a bug with AI, read the documentation, write a patch, and add something on top of what the AI gave you 1:
"If you actually want to add value, read the documentation, create a patch too, and add some real value on top of what the AI did. Don't be the drive-by 'send a random report with no real understanding' kind of person."
By rc5 on May 24, the tone had shifted from policy to enforcement 2:
"So I think I'll start being a bit more hardnosed about this kind of unnecessary churn this late in the game."
The rc5 release was larger than typical for that stage: Linus called it "pretty big" and said he was "not entirely happy" with it 2. The bulk of the changes were trivial fixes to random drivers, each low-risk on its own, but in aggregate the release looked more like a normal merge window than an rc stabilization phase. He explicitly named the source: "several of these series were triggered by AI code review." 2 His reasoning: a low chance of introducing a regression is still not zero, and a late-stage rc is the wrong time to take on that risk for fixes that do not address regressions.
For architecture and tech leads: this is effectively a contributor policy signal. AI-generated patches are now a named category Linus is prepared to reject at the gate during rc phases. If your team is contributing to the kernel or upstream libraries, the bar just moved.

The rest of the kernel community is saying the same thing

Linus is not alone. Networking subsystem maintainer Jakub Kicinski (Meta) described the current wave of AI-driven bug reports in the 7.1 networking fixes pull request on May 21 3:
"Craziness continues with no end in sight. Even discounting the driver revert this is a pretty huge PR for standards of the previous era."
Kicinski added a prediction: "I'd speculate — we haven't seen the worst of it, yet." 3 He did note one meaningful positive: so far, no case has surfaced where an AI-reported bug was fixed and that fix caused a regression for a real user.
Sound subsystem maintainer Takashi Iwai (SUSE) took a drier tone on May 22 when reporting on the continuing stream of AI/LLM-driven fixes 4:
"As expected, we still continue receiving lots of small fixes."
The AI tools behind most of this week's fixes across the graphics, WiFi, audio, and networking subsystems were GitHub Copilot and Claude Code, which can be tracked through Assisted-by tags in commits on git.kernel.org 5.
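For readers who want to run the same kind of tally on their own tree, here is a minimal sketch of counting Assisted-by trailers in commit messages. It assumes each trailer sits on its own line in the form "Assisted-by: <tool>", optionally followed by ":<model>" and other text. That shape, the function name `count_assisted_by`, and the sample messages are illustrative assumptions, not taken from the kernel tree or from the cited report.

```python
import re
from collections import Counter

# Assumed trailer shape: "Assisted-by: <tool>[:<model>] ..." on its own line.
TRAILER = re.compile(r"^Assisted-by:[ \t]*(\S.*)$", re.IGNORECASE | re.MULTILINE)


def count_assisted_by(messages):
    """Count how many commit messages credit each tool in an Assisted-by trailer.

    A commit that names the same tool twice is counted once for that tool.
    """
    counts = Counter()
    for message in messages:
        tools = set()
        for value in TRAILER.findall(message):
            # Treat the text before the first ':' as the tool name (an assumption).
            tools.add(value.split(":", 1)[0].strip().lower())
        counts.update(tools)
    return counts


# Made-up commit messages, for illustration only.
sample_messages = [
    "net: fix refcount leak\n\nAssisted-by: Claude:some-model\nSigned-off-by: A <a@example.org>\n",
    "ALSA: fix off-by-one\n\nAssisted-by: Copilot\n",
    "wifi: tidy error path\n\nSigned-off-by: B <b@example.org>\n",
]
tool_counts = count_assisted_by(sample_messages)
```

The input can be any iterable of full commit messages, for instance the output of `git log --format=%B%x00` split on the NUL byte. Counting per commit rather than per trailer keeps one heavily annotated commit from inflating a tool's share.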
One counter-signal this week: Greg Kroah-Hartman (Linux stable branch maintainer, widely regarded as the kernel's second-most influential voice) spoke at Rust Week 2026 in Utrecht on May 20, calling Rust "more fun for maintainers" and "more secure Linux for users," and encouraging more Rust developers to get involved with the kernel 6. The Rust push and the AI patch pressure are separate stories, but they are playing out in the same project at the same time.

George Hotz: "The Eternal Sloptember"

George Hotz — creator of tinygrad and comma.ai, and the person who first unlocked the iPhone — published a blog post on May 24 titled "The Eternal Sloptember" 7.
The post's core claim is unusually direct:
"I'm calling it now, the adoption of AI agents into software development will be one of the most costly mistakes in the field's history."
His evidence is personal: six months of attempting to use AI agents on tinygrad and a USB-to-PCIe chip reverse engineering project. His conclusion was that doing the work by hand was faster and produced better results every time. He explicitly aligned himself with Yann LeCun and Gary Marcus: current LLM architectures, he argues, cannot actually program — real programming agents would need world models — and the problem is getting harder to see, not easier:
"Agents cannot program, and it's taking longer and longer to realize that they can't."
The structural argument is more interesting than the personal experience report. Hotz's claim is that agents are harmful to large organizations specifically, not to high-performing individuals or small teams. High performers can catch the agent's mistakes. Low performers can't. The result is a 10× volume increase in output at constant or lower quality, and the mistakes are increasingly hard to detect. He cited Apple's push to roll out AI tools to all engineers as an example and asked the reader to predict whether macOS would be better or worse in two years.
"The real story of this era will be who manages to avoid harming themselves in their AI psychosis."
The Hacker News post drew 420 points. Whether or not you agree with Hotz's conclusions, the framing — output scales, judgment doesn't — is consistent with what Linus Torvalds and Jakub Kicinski are observing from the receiving end of AI-generated patches.

Armin Ronacher: 80% slop, measured

Armin Ronacher, creator of Flask, Jinja2, and Werkzeug, published "Building Pi With Pi" on May 24 with something most AI criticism lacks: data 8.
His Pi project (part of the Earendil network) received 3,145 external issues and pull requests over 90 days. About 2,504 of them — roughly 80% — were automatically closed. Of the PRs, roughly 60 out of 714 were merged: an 8% merge rate.
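The two headline percentages follow directly from the counts quoted above. A quick check, treating the approximate figures ("about 2,504", "roughly 60") as exact for the sake of the arithmetic; the helper name `rate` is ours, not Ronacher's:

```python
def rate(part, whole):
    """Return part/whole as a percentage, rounded to one decimal place."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return round(100 * part / whole, 1)


# Counts as quoted in the article; the source gives them as approximate.
auto_close_rate = rate(2504, 3145)  # issues and PRs closed automatically -> 79.6
pr_merge_rate = rate(60, 714)       # pull requests merged -> 8.4
```

Note that the two rates have different denominators: the auto-close figure covers issues and pull requests together, while the merge rate covers pull requests only.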
Ronacher coined two terms in the post. "Slop issues" are AI-generated submissions with approximately 5% human content and 95% inaccurate AI filler. "Clanker" is his preferred replacement for "agent" — he argues that agency is a human property, and calling the tools "agents" imports false assumptions about their capabilities.
His technical diagnosis goes deeper than "the submissions are bad." The core problem he identifies is that AI tools approach every local failure with a local defensive fix (add a fallback, add a tolerant reader, add a migration path) instead of understanding the system's global invariants and eliminating the condition that allowed the bad state to arise. He is equally blunt about what the extra volume has and has not changed:
"Keep in mind that AI has not increased the number of people who need software, or the number of maintainers who can review it. It has mostly increased the amount of code and the number of projects competing for attention."
The open-source implication he draws: the value of open source is in collaboration, not in more isolated sessions between a person and a machine.
"Open Source needs more collaboration, not more isolated work with a machine."
The Hacker News post drew 155 points. For maintainers specifically: the 80% auto-close rate is a concrete benchmark to compare against the quality of your own project's incoming issues and PRs.

Brief dispatches

Junior hiring pipeline — Filipe Brito Ferreira. Front-end engineer Filipe Brito Ferreira published a data essay on May 25 arguing that the current junior hiring contraction will produce a senior talent shortage in 2031 9. His figures: junior developer hiring is down roughly 40% compared to pre-2022 levels; entry-level hiring at the top 15 tech companies fell 25% from 2023 to 2024; computer science graduate unemployment reached 6.1–7% in 2025; and a LeadDev survey found 54% of engineering leaders plan to reduce junior hiring further in 2026 because AI copilots let senior engineers absorb more work. His argument: "AI didn't remove a job. It removed the apprenticeship loop that produced the next generation of seniors at scale." The 5–7 year lag between hiring a junior and producing a senior means the consequence of this year's decisions lands in 2031.
C compiler portability — lemon. An independent C compiler developer published "On C extensions, portability, and alternative compilers" on May 24 10, documenting how glibc's sys/cdefs.h, SDL's byte-order detection, and OpenBSD's libc each hardcode compatibility paths for GCC, Clang, and MSVC while silently breaking other compilers. The practical upshot: C portability in the wild is harder than the standard implies, and the path forward for alternative compilers is to either upstream patches or fake a GCC version string. The post reached 81 points on Hacker News.
Vite v8.0.14 shipped on May 21, updating rolldown (the Rust-based bundler that backs Vite 8) to version 1.0.2 11. Evan You had no personal public posts this week.
Svelte 5.55.9 released on May 20 with fixes for {#await} batch operation and SSR hydration edge cases 12. Rich Harris had no public posts this week across tracked platforms.

The through-line across the kernel thread and the two essays is structural, not emotional. Each author is describing the same cost-transfer: AI removes the friction of producing output, which scales volume, which scales the review burden on humans who can't refuse it. Linus is at the end of that pipeline and is now explicitly naming the category. Tech leads who are measuring AI productivity only on the output side should probably also be measuring what it's doing to reviewer attention budgets downstream.
Cover image: Tux mascot, used as Linux branding reference
