Doom is the pitch deck, the harness is 90% (2026)

George Hotz flew back from Berkeley and called AI doomerism a marketing strategy. Addy Osmani published a whitepaper arguing the model is only 10% of what matters. Daniel Stenberg announced that curl 8.21.0 ships Wednesday with a record 98 contributors and 18 pending CVEs, and the New York Attorney General's office called. A useful week.

Coverage window: June 15–22, 2026.

George Hotz: doom narratives exist because the technology doesn't justify the valuation

Hotz (creator of tinygrad, comma.ai; author of the first iPhone unlock) published "The doom justifies the valuation" on June 21 — his second consecutive AI economics post after returning from a philosophy-focused break. 1 The argument is sharper and more direct than his June 11 deflation post.

After two weeks in Berkeley, he describes what he saw: "This is a cult of atheistic hedonists needing AI doom to be true to justify their life choices." 1 His diagnosis for why Anthropic publishes policy documents instead of technical blog posts: "It's all just nonsensical hype, it's not about technology. It's a questionable promise of future technology. The reason they can't just write technical blog posts is that the current technology doesn't justify the valuation." 1

The contrast he offers is Zhipu AI's GLM-5.2 blog, which dropped June 16 as an open-source (MIT) frontier model with a 1M-token context window, benchmarks comparing against Claude Opus 4.8 and GPT-5.5, and architecture detail on how they reduced per-token FLOPs by 2.9× at 1M context. 2 "Read the GLM-5.2 blog post. This is how I imagined AI progress being, and this is frontier stuff, the model is on par with Opus 4.8 and GPT-5.5. It's a pleasant technical look into how things are slowly improving." 1 He puts comma.ai's openpilot 0.11.1 release post in the same bucket — honest engineering writing that documents what the technology actually does. 3

GLM-5.2 long-horizon benchmark results from the Zhipu AI blog (bar chart: FrontierSWE 74.4%, PostTrainBench 34.3%, SWE-Marathon 13.0%, leading or near-leading against Claude Opus 4.8 and GPT-5.5). This is the kind of chart Hotz contrasts with Anthropic's policy documents. 2

Hotz quotes an independent PDF called "Schizoposting" to articulate what the doom narrative accomplishes: "The only possible conclusion is that it's designed to cause panic. In fact, it is optimized for it: there is no possible framing of the actual product(s) that could possibly induce more psychological spiraling in the media and its audience." 1 He also cites a May 22 Forbes article in which Goldman Sachs CEO David Solomon called AI mass-unemployment fears "overblown" — Hotz reads Solomon as "the adults in the room starting to call BS." 4

The post closes with two untranslated Chinese terms: 内卷 (nèijuǎn — involution, the zero-sum competitive spiral where everyone races for diminishing returns) and 摆烂 (bǎilàn — giving up, letting things rot rather than continuing to play a broken game). 1 Placed at structural pauses in the text, they function as verdict: the AI hype cycle is 内卷, and the rational response might be 摆烂. He asks: "Can someone write an AI 2027 but instead of some totalizing doom propaganda it talks about the bubble unwinding and what we can do to prevent this kind of crap in the future?" 1

This is the second leg of a two-part argument. The June 11 post argued AI will crash knowledge-worker wages because technology prices converge to cost. The June 21 post explains how the bubble stays inflated until that happens: doom marketing fills the gap between current technology and current valuations. 1

Addy Osmani: the model is 10%, the harness is 90%

Addy Osmani (Director of Engineering at Google Cloud AI) published "The New Software Lifecycle" on June 16 — a blog post distilling a Google whitepaper co-authored with Shubham Saboo and Sokratis Kartakis, published simultaneously on Kaggle. 5 6 The central claim is that most teams are optimizing the wrong thing.

The framework: Agent = Model (10%) + Harness (90%). The harness is everything else — instruction and rule files, tools and MCP servers, orchestration logic, sub-agent routing, guardrails, and observability. 5 Two public experiments support the magnitude. On Terminal Bench 2.0, a team moved a coding agent from outside the top 30 to the top 5 by changing only the harness, with the same underlying model. LangChain improved scores by 13.7 points on the same benchmark by changing only the system prompt, tools, and middleware — again, same model. 5 Osmani's summary: "The model is the engine. The harness is the car, the road, and the traffic laws." 5

Osmani's Agent = Model + Harness architecture diagram: an LLM at the center, surrounded by a Framework Layer (instructions, tools, MCP servers, orchestration, guardrails), a Developer Interface, and Cloud Infrastructure. Model ~10%, harness ~90%. 5

The practical implication is where Osmani spends most of his analysis. Context engineering — how you load what the agent sees — is the highest-leverage variable in the harness. Context splits into six types (instructions, knowledge, memory, examples, tools, guardrails) and into static (always loaded, high token cost) vs. dynamic (loaded on demand, lower per-turn cost). Agent Skills use progressive disclosure: at startup, the agent sees only metadata; it loads full instructions when a task matches; it pulls reference material only when needed. The result: one agent can carry dozens of skills and pay only for what it actually uses. 5
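The progressive-disclosure idea above can be sketched in a few lines of Python. This is a minimal illustration, not Osmani's or any vendor's implementation: the `Skill` class, the keyword-matching rule, the example skills, and the word count used as a stand-in for tokens are all assumptions made for the example.

```python
from dataclasses import dataclass


def rough_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: count whitespace-separated words."""
    return len(text.split())


@dataclass
class Skill:
    name: str
    description: str          # metadata: always visible to the agent
    instructions: str         # full body: loaded only when a task matches
    keywords: tuple[str, ...] = ()

    def matches(self, task: str) -> bool:
        # Hypothetical matching rule: any keyword appears in the task text.
        lowered = task.lower()
        return any(k in lowered for k in self.keywords)


def build_context(skills: list[Skill], task: str) -> str:
    """Always include metadata; include full instructions only on a match."""
    parts = [f"[skill] {s.name}: {s.description}" for s in skills]
    parts += [s.instructions for s in skills if s.matches(task)]
    return "\n".join(parts)


skills = [
    Skill("release-notes", "Draft release notes from merged changes.",
          "Step one: list merged changes. " * 50, ("release", "changelog")),
    Skill("db-migration", "Write and review schema migrations.",
          "Step one: inspect the schema. " * 50, ("migration", "schema")),
]

task = "Prepare the changelog for the next release"
dynamic = build_context(skills, task)
static = "\n".join(s.description + "\n" + s.instructions for s in skills)

print(rough_tokens(dynamic), "tokens with progressive disclosure")
print(rough_tokens(static), "tokens with everything loaded up front")
```

Only the matching skill's instructions enter the context, so the per-turn cost grows with the skills a task actually uses rather than with the number of skills installed.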

Osmani also puts numbers on what AI has and hasn't changed in the SDLC. Implementation time compresses from weeks to hours. But requirements, architecture, and verification remain slow — they're judgment work that agents don't have the business context to do. 5 Two data points on the productivity gap: industry surveys show 25–39% improvement in developer throughput, but a METR study found experienced developers were 19% slower on certain tasks when factoring in the time spent reviewing and correcting AI output. 5 Osmani's framing: "Implementation is where the gains and the caveats both live. The honest summary is that AI turns implementation from writing into reviewing." 5

Verification, he argues, is the new dividing line between vibe coding and agentic engineering. The whitepaper proposes two mechanisms: output evaluation (is the final result correct?) and trajectory evaluation (was the path to get there — the tool calls and reasoning — sound?). His one-liner for team leaders: "If I had to hand a leader one line from the paper, it's this: set the bar at the eval, not the demo." 5
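The difference between the two mechanisms is easier to see in code. The sketch below is an assumption-laden toy, not the whitepaper's evaluation harness: `AgentRun`, `ToolCall`, the tool names, and the specific trajectory rules (an allow-list and a call budget) are hypothetical and chosen only to show that a run can pass output evaluation while failing trajectory evaluation.

```python
from dataclasses import dataclass, field


@dataclass
class ToolCall:
    tool: str
    ok: bool  # whether the call succeeded


@dataclass
class AgentRun:
    final_output: str
    trajectory: list[ToolCall] = field(default_factory=list)


def evaluate_output(run: AgentRun, expected: str) -> bool:
    """Output evaluation: is the final result correct?"""
    return run.final_output.strip() == expected.strip()


def evaluate_trajectory(run: AgentRun, allowed_tools: set[str],
                        max_calls: int) -> list[str]:
    """Trajectory evaluation: return a list of problems with the path taken."""
    problems = []
    if len(run.trajectory) > max_calls:
        problems.append(f"too many tool calls: {len(run.trajectory)} > {max_calls}")
    for call in run.trajectory:
        if call.tool not in allowed_tools:
            problems.append(f"disallowed tool: {call.tool}")
        if not call.ok:
            problems.append(f"failed call: {call.tool}")
    return problems


run = AgentRun(
    final_output="42",
    trajectory=[ToolCall("read_file", True), ToolCall("shell_rm", True)],
)
output_ok = evaluate_output(run, "42")
trajectory_problems = evaluate_trajectory(run, {"read_file", "run_tests"}, max_calls=5)
print("output ok:", output_ok)
print("trajectory problems:", trajectory_problems)
```

Here the answer is right but the path used a tool outside the allow-list, which is exactly the kind of failure a demo hides and an eval surfaces.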

The TCO economics make the case concrete. Vibe coding looks cheap upfront — the "prompting tax," token burn, maintenance overhead, and security risk all accrue later. After the crossover point, vibe coding costs 3–10× more per feature than disciplined agentic engineering. 5 On where failure actually lives: "Most agent failures are configuration failures. I find that encouraging, because configuration is the part I can fix today, without waiting for a better model." 5

Osmani's two X posts on the piece, dated June 15 and June 20, drew roughly 1.4M views combined. 7 8

On loop engineering. On June 22, Osmani published a second piece on O'Reilly titled "Loop Engineering" — formalizing a concept that Boris Cherny (Claude Code lead at Anthropic) and Peter Steinberger (OpenClaw creator) had articulated in early June. Cherny: "I don't prompt Claude anymore. I have loops running that prompt Claude and figuring out what to do. My job is to write loops." 9 Steinberger: "You shouldn't be prompting coding agents anymore. You should be designing loops that prompt your agents." 9

Osmani's six-component anatomy of a loop: automation (scheduling and discovering work), work trees (isolating parallel agents), skills (encoding project knowledge), connectors (wiring agents to real tools), sub-agents (separate producer and checker roles), and external state (tracking what's done and what's pending). 9 He was struck that Claude Code and Codex independently arrived at the same six primitives. His caution: "The loop changes the work; it does not delete you from it." 9 Three risks he names directly — validation is still your job, comprehension debt (the gap between what the loop ships and what you actually understand) widens as the loop speeds up, and cognitive surrender (just accepting whatever the loop produces) is the most comfortable and most dangerous posture. 9
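Two of those six components, separate producer and checker sub-agents and external state, can be sketched without any model at all. The code below is a toy under stated assumptions: `producer` and `checker` are hypothetical stand-ins for agent calls, the JSON state file layout is invented for the example, and automation, work trees, skills, and connectors are left out entirely.

```python
import json
import tempfile
from pathlib import Path


def producer(task: str) -> str:
    """Stand-in for a producing sub-agent (hypothetical, no model call)."""
    return f"patch for: {task}"


def checker(task: str, result: str) -> bool:
    """Stand-in for a separate checking sub-agent."""
    return task in result and "TODO" not in result


def run_loop(state_path: Path) -> dict:
    """One pass: pick up pending work, produce, check, persist the outcome."""
    state = json.loads(state_path.read_text())
    for item in state["tasks"]:
        if item["status"] != "pending":
            continue  # external state records what is already done
        result = producer(item["task"])
        item["status"] = "done" if checker(item["task"], result) else "needs_human"
        item["result"] = result
    state_path.write_text(json.dumps(state, indent=2))
    return state


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "state.json"
    path.write_text(json.dumps({"tasks": [
        {"task": "fix flaky test", "status": "pending"},
        {"task": "bump dependency", "status": "done"},
        {"task": "TODO clarify scope", "status": "pending"},
    ]}))
    final_state = run_loop(path)

statuses = [t["status"] for t in final_state["tasks"]]
print(statuses)
```

The `needs_human` status is the point of the sketch: anything the checker rejects is routed back to a person rather than shipped, which keeps validation where Osmani says it belongs.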

Daniel Stenberg: QUERY lands in an RFC, curl 8.21.0 ships Wednesday

RFC 10008 — "The HTTP QUERY Method" — was published June 15 as an IETF Proposed Standard by J. Reschke, J.M. Snell, and M. Bishop. 10 Daniel Stenberg (creator of curl, maintainer since 1998) published "QUERY with curl" on June 21, explaining why the method matters and what to watch out for. 11

The gap QUERY fills: GET is safe and idempotent but can't carry a request body with defined semantics. POST can carry a body but isn't idempotent — sending it twice might create two records. QUERY is safe and idempotent, carries a body, and can be automatically retried after connection failures without concern for partial state changes. 10 Stenberg's shorthand: "For all practical purposes you can think of QUERY as a way to send a GET with a body." 11
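To make the retry property concrete, here is a small self-contained Python sketch that sends a QUERY request with a JSON body and retries on connection errors. Everything about the server side is an assumption for the demo: the toy `QueryHandler`, the `/search` path, and the echo-style response are invented, and the retry policy is a simplification rather than anything the RFC prescribes.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class QueryHandler(BaseHTTPRequestHandler):
    """Toy server: answers QUERY by echoing the filter it received."""

    def do_QUERY(self):  # http.server dispatches on the method name
        length = int(self.headers.get("Content-Length", 0))
        query = json.loads(self.rfile.read(length))
        payload = json.dumps({"matched": query}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, *args):  # keep the example quiet
        pass


def query_with_retry(host, port, path, body, attempts=3):
    """Send QUERY; retry on connection errors, which is acceptable because
    the method is defined as safe and idempotent."""
    last_error = None
    for _ in range(attempts):
        conn = http.client.HTTPConnection(host, port, timeout=5)
        try:
            conn.request("QUERY", path, body=body,
                         headers={"Content-Type": "application/json"})
            response = conn.getresponse()
            return response.status, json.loads(response.read())
        except (ConnectionError, http.client.HTTPException) as exc:
            last_error = exc
        finally:
            conn.close()
    raise last_error


server = HTTPServer(("127.0.0.1", 0), QueryHandler)  # port 0: pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()

status, data = query_with_retry("127.0.0.1", server.server_port, "/search",
                                json.dumps({"name": "curl"}).encode())
print(status, data)
server.shutdown()
server.server_close()
```

The same blind retry wrapped around POST could create duplicate records, which is why a client can only automate it for a method with these guarantees.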

Three design motivations from the RFC: URLs above roughly 8,000 bytes become unreliable across servers and intermediaries. 11 Encoding complex data into a URL carries encoding overhead. And URLs are more likely to be logged by proxies and intermediaries than request bodies — QUERY improves privacy for sensitive query payloads. 10

curl supports it out of the box: curl -d "data" -X QUERY https://example.com/. No new code is needed, because --request (-X) already handles arbitrary method names. 11 Stenberg flags one redirect caveat as critical: when following redirects with QUERY, use --follow, not the older --location/-L. With --location, the method set by -X is reused on every subsequent request regardless of response code; --follow adjusts the method according to HTTP semantics. He notes his earlier work on --follow "was largely motivated by this method, and methods like this." 11

curl 8.21.0 countdown. The release is confirmed for Wednesday, June 24. RC3 shipped June 17 with no regressions. 12 The final numbers as of RC3: 18 pending CVEs, approximately 250 bug fixes, and a contributor count Stenberg called a milestone: "This time around we have gotten help from a record amount of contributors — right now 98 named individuals. No other release in curl history has had this many people to thank." 12 A live-streamed release video is planned.

The Trail of Bits security audit (one engineer plus AI, one week of work) wrapped up: all 22 issues addressed, one became a pending CVE. 13 Daniel also spotted a glibc strftime %s bug — the glibc implementation incorrectly applies a timezone offset to the %s format (seconds since epoch) — and found it documented in the glibc man page as intentional behavior. Curl implemented its own %s support as a workaround. 12
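The underlying pitfall is not specific to C. The Python sketch below is an analogy only, not curl's workaround (curl is written in C, and its actual code is not shown in the source): it assumes the problem comes from interpreting broken-down UTC fields as local time, and contrasts that with a conversion that never consults the local timezone.

```python
import calendar
import time

epoch = 1_750_000_000                 # an arbitrary instant, in seconds since the epoch
utc_fields = time.gmtime(epoch)       # broken-down UTC time, like C's gmtime()

# Timezone-independent: interprets the fields as UTC.
roundtrip_utc = calendar.timegm(utc_fields)

# Timezone-dependent: mktime() interprets the same fields as local time,
# so the result is shifted by the local UTC offset unless the zone is UTC.
roundtrip_local = int(time.mktime(utc_fields))

print("original:        ", epoch)
print("via timegm:      ", roundtrip_utc)
print("via mktime:      ", roundtrip_local)
print("mktime shift (s):", roundtrip_local - epoch)
```

On a machine set to UTC both round trips agree; anywhere else the `mktime` path drifts by the local offset, which is why computing the value directly from UTC fields is the safer route.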

Daniel Stenberg's "curl Summer of Bliss 2026" official stamp: a red seal on white paper. 13

Summer of bliss: broader impact. The June 15 announcement that curl will not accept vulnerability reports during July has drawn a response well beyond the curl community. The HN submission hit 787 points and 316 comments, almost unanimously supportive. 14 Three other open source projects (libexpat, ImageMagick, and OctoPrint) announced similar pauses in the days that followed. 12

The unexpected development: on June 18, the New York State Attorney General's office contacted Daniel asking for a call to discuss "what policies might effectively support the efforts of maintainers to stay ahead of the volume of vulnerabilities being discovered using LLMs." 13 His public response on Mastodon: "If I get a voice I think I better use it for good. I've agreed to this meeting." 13 He's also submitted a talk for NSSS 2026 titled "Deluge, Distill, Defend: Mastering the AI Vuln Overflow." The quiet curl maintainer who took July off is now in a conversation with the state of New York about AI vulnerability policy.

Brief dispatches

DHH — defense tech investment and server sticker shock. David Heinemeier Hansson (creator of Ruby on Rails, co-founder of 37signals) published "European Delusions & Danish Drones" on June 19, announcing a personal investment in Upteko, a Danish drone startup with hardware battle-proven in Ukraine. 15 His case: "Europe needs its own Andurils. Not because it can't also continue to buy systems from the Americans, but because a good ally is self-sufficient, equally inventive, and armed to the teeth with a diverse fleet of awesome, native weapons." 15 The software angle he sees: "The future of drones is as much in software as in propulsion. And I know something about that." 15

Separately, on June 19, DHH posted that 37signals walked away from a server hardware purchase for non-essential infrastructure projects after finding that machines which cost around $40,000 in January now cost over $100,000 — a roughly 2.5× increase he attributes directly to AI demand. "Econ 101 is working: We're holding off. Letting the capacity flow to the higher bidder!" 16 The tweet drew 1,964 likes and 169K views.

Armin Ronacher — loop engineering limits and the GitHub issue problem. Armin Ronacher (creator of Flask, Jinja2, Werkzeug; co-maintainer of the Pi coding agent) ran AI coding loop experiments over the weekend of June 21–22 and published his findings on X: "The only cases where they work so far for me are a) review b) research c) autoresearch." 17 Loops for actual implementation work on medium-sized projects have not worked for him — he's asking the community for examples of successful implementation-level looping to examine. The tweet got 411 likes and 145K views, suggesting the result resonated. 17

On the maintainer side, Armin posted a separate observation the same day about LLM-generated GitHub issues: "Without fail, when I push back on an LLM-generated issue, the author eventually returns with one or two clear paragraphs explaining the actual problem. That should have been the issue." 18 The pattern is consistent with Miguel Grinberg's "Reverse Centaur" post from June 12, where Grinberg announced he'd start closing unsolicited AI-generated PRs. Two maintainers, independently, landing on the same friction point.

He also shipped a Pi agent weekend update: the /new command is faster when loading many extensions, GLM-5.2 model configurations are fixed on some providers, there are new vLLM reasoning configuration options, and the edit tool's fuzzy algorithm no longer touches unmodified lines. 19

Rich Harris — performance as infrastructure cost. Rich Harris (creator of Svelte, engineer at Vercel) posted on Bluesky around June 20 on the compounding economics of faster software: making software faster reduces token costs, server costs, and bandwidth costs simultaneously. 20 No new Svelte blog posts this week; the official blog is on a monthly cadence. Harris's Bluesky post connects to themes from his June 11 SvelteKit 3 workshop on Frontend Masters, where he covered performance optimization in the new framework.

Evan You / VoidZero, Linus Torvalds, Miguel Grinberg — no new public statements within the June 15–22 window. The Linux 7.2 merge window is active with code integrations, but Linus's opinionated commentary typically lands with RC1, expected around June 28. Evan You is in a post-acquisition integration period following VoidZero joining Cloudflare on June 4. Grinberg's most recent post remains the June 12 Reverse Centaur piece.

The week's through-line is two coexisting reactions to the same AI moment. Hotz and Armin are pushing back from outside the system — doom narratives are marketing, LLM-generated issues are noise, loops don't yet work for real implementation. Addy is pushing back from inside, with data and a constructive framework: stop optimizing the model and fix your harness. Both responses assume the same premise — that most current AI discourse is not engineering. Daniel Stenberg, by taking July off and drawing the attention of a state attorney general, ended up writing policy whether he meant to or not.

Cover image: AI-generated.
