
Best of your X follows: June 6
Today: Greg Brockman finds Codex 'fun' to use as a computer interface; Anthropic's Claude passes an NMR chemistry test, matching dedicated scientific software; Ethan Mollick flags a widening model-tier gap between Google and the OpenAI/Anthropic pair; Paul Graham spots a new TAM in AI cost optimization; and tight one-liners from Chollet and Naval on productivity and mission.

Today's digest runs from Greg Brockman enjoying Codex as his way of using a computer to Paul Graham flagging a new TAM hiding inside AI cost inefficiency, plus Ethan Mollick clocking a widening gap at the top of the model stack, Anthropic's Claude passing its chemistry exam, and a pair of tight one-liners from Chollet and Naval that are worth sitting with.
AI tools and developer experience
Greg Brockman had two quick dispatches from his week of eating his own cooking. Shortly after midnight UTC he noted email integration had landed in ChatGPT — and a few hours later dropped what sounds like a throwaway line but isn't:
"So much more fun to use a computer via Codex." That verb — fun — is doing a lot of work. Most productivity tools are described as efficient, fast, or powerful. Fun is what you say when the interface stops feeling like a tool and starts feeling like a collaborator. Whether that holds at scale is a different question, but it's the kind of framing that usually precedes a category shift.
Research
Anthropic published a science blog post on using Claude Opus 4.7 as a chemist's assistant, specifically for NMR (nuclear magnetic resonance) spectroscopy — one of the core techniques for understanding molecular structure. 1
The finding: Opus 4.7 matches — and on some tasks beats — dedicated NMR software. This matters because NMR analysis is fiddly and domain-specific, exactly the kind of task where "general intelligence" was supposed to struggle. If a frontier model can do this well without fine-tuning, the list of fields where bespoke scientific software has a moat just got shorter.
Competition and industry structure
Ethan Mollick posted the sharpest market-structure observation of the day just before 18:00 UTC — so barely in today's window, but worth leading the section with:
"The Gemini Pro models do not seem to be iterating anywhere near as quickly as Claude or GPT (last release was 3.1 Pro in February). It's causing a growing performance gap between Google and the other two labs, and the Gemini 3.5 Flash model, good as it is, doesn't close it much."
This is notable because the conventional wisdom has been that frontier AI is a four-horse race (OpenAI, Anthropic, Google, Meta). Mollick's "other two labs" framing implies the field has already compressed to three, with Google's frontier tier quietly slipping. Flash is doing well in the efficiency tier, but Pro, the intelligence flagship, has been static for roughly four months. Four months in this industry is a long time.
Earlier in the morning, Mollick shared an Anthropic diagram on agentic architectures:
Agent Teams and Agentic Workflows are both new and powerful, he noted — but also "very token hungry." The practical implication: as more people route tasks through multi-agent pipelines, compute costs will compound faster than capability gains appear on benchmarks. The bill is real before the ROI is.
Startups and business
Paul Graham had a productive few hours on X.
The first post: a YC office-hours encounter that reframes where the value in AI infrastructure actually sits. 2
The startup cuts companies' LLM token costs by roughly half and splits the savings with the customer. Graham's math: half of the spend is saved, and (presumably on an even split) the startup keeps half of that, so the TAM is a quarter of AI model companies' corporate revenue. That's a large number, and it grows as AI spending grows. Cost optimization on top of infrastructure is a durable business if the underlying infrastructure keeps scaling.
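The "quarter" figure is easy to check with a few lines of arithmetic. The sketch below is illustrative only: the function name, the parameter names, and the $1M spend are hypothetical, and the even 50/50 split of savings is an assumption inferred from the quarter figure rather than something the post spells out.

```python
def optimizer_economics(token_spend, savings_rate=0.5, vendor_share=0.5):
    """Rough economics of a token-cost optimizer (hypothetical model).

    token_spend  -- what the customer currently pays for LLM tokens
    savings_rate -- fraction of that spend the optimizer removes (assumed 0.5)
    vendor_share -- fraction of the savings the optimizer keeps (assumed 0.5)
    """
    savings = token_spend * savings_rate
    vendor_revenue = savings * vendor_share
    # The customer pays the reduced token bill plus the vendor's cut.
    customer_total = token_spend - savings + vendor_revenue
    return savings, vendor_revenue, customer_total


spend = 1_000_000  # illustrative annual token spend, not a figure from the post
savings, vendor_revenue, customer_total = optimizer_economics(spend)
print(f"vendor revenue as share of original spend: {vendor_revenue / spend:.0%}")
print(f"customer total as share of original spend: {customer_total / spend:.0%}")
```

Under these assumptions the optimizer earns 25% of the original token spend and the customer ends up paying 75% of what it paid before, which is where "a quarter" comes from. A different split or savings rate changes the fraction proportionally.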
A few hours later, he made the strategic logic explicit: 3
"If big companies can't make a net return on their LLM token costs, that doesn't mean it's impossible to. In fact this is exactly what you'd expect to happen with a new technology. Incumbents can't use it well, and are replaced by upstarts who can."
The argument is familiar from previous tech cycles — but it's easy to miss in the moment, especially when the failure of large enterprises to profit from LLMs gets read as a sign the technology doesn't work. Graham is saying it's the opposite signal: an opening.
Principles worth keeping
Two short ones today, both worth the wall space.
François Chollet, this morning: 4
"Code volume does not represent productivity." That sentence lands harder in a week when AI coding agents are being benchmarked partly on lines generated. Volume is easy to produce; outcomes are not.
Naval, this afternoon: 5
"The product is the mission." Five words that collapse a lot of strategy consulting into a single check. If your mission and your product have diverged, one of them is wrong.