Best of your X follows: June 3

Today: a new White House AI executive order draws support from OpenAI and Anthropic, Google ships Gemma 4 12B for on-device use, Anthropic expands Project Glasswing to roughly 150 more orgs in 15+ countries, Gemini 2.5 beats law professors 75% of the time in blind evaluations, Ethan Mollick flags how few people have an accurate mental model of LLMs, and Uber reportedly caps coding agent spending at $1,500/month per employee per tool.

Daily Best of Who I Follow on X
June 4, 2026 · 02:03
A White House AI executive order, a Google model launch, Anthropic's enterprise expansion, Codex computer use "growing very fast," and Ethan Mollick's two most-quoted observations of the day: here is what the people you follow were saying in the past 24 hours.

Policy: a new White House AI executive order

Sam Altman and Anthropic both responded positively to an executive order published overnight on promoting advanced AI innovation and security. Altman's framing: the US should lead by building the best models, keeping them safe, and putting cyber tools in the hands of "trusted defenders."
Anthropic called the EO "an important step" and said it looks forward to supporting implementation. The White House document is titled Promoting Advanced Artificial Intelligence Innovation and Security. 1

Model releases: Google ships Gemma 4 12B

Google DeepMind retweeted the launch of Gemma 4 12B, billed as a "unified, encoder-free multimodal model" designed to run on-device with high-performance intelligence. That makes it a meaningfully different architecture from earlier Gemma generations: it drops the encoder entirely.

Enterprise / business: OpenAI on Amazon Bedrock, Anthropic's Glasswing at 150 orgs

Two enterprise moves that landed in the same news cycle:
  • Greg Brockman noted that Codex computer use is "growing very fast," the first time he has used growth framing rather than just capability framing for the product. 2
  • Anthropic expanded Project Glasswing to approximately 150 additional organizations across more than 15 countries, extending access to Claude Mythos Preview. The expansion follows the initial cohort Anthropic announced last month. 3

Research / society: LLMs vs. law professors — Gemini wins 75% of blind evaluations

Ethan Mollick flagged a study where law professors wrote questions they regularly answer during office hours. Gemini 2.5 and human professors both answered, and a separate group of law professors judged the results blindly:
  • Gemini had a 75% win rate against professors
  • Gemini's answers were rated less harmful than the humans'
  • Newer models are doing better still
This sits alongside Mollick's earlier observation today that most people — including accomplished people — have no accurate mental model of how LLMs work. They assume copying, assume average outputs, assume no new ideas. The gap between public perception and actual performance keeps widening. 4

Tools and product design: everything apps still look like chatbot-IDE hybrids

Mollick's sharpest UX observation of the day:
"The everything apps still look a lot like hybrids between chatbots and IDEs, rather than something built for general knowledge work. Too much assuming linearity & that final outputs are the only goal, too little connection to research, not enough chances to steer or select, etc." 5
The critique is pointed: most general-purpose AI products treat the final output as the only thing that matters and neglect the research and steering that happen along the way.

AI and business: the YC spring batch and coding agent costs

Paul Graham on his YC spring batch office hours:
"Some of the biggest ideas I've ever encountered. There is so much more going on now than just 'AI for x'. Just as there was more going on during the microcomputer revolution than 'software for x'." 6
Simon Willison flagged a complementary data point: Uber has reportedly capped coding agent spend at $1,500/month per employee per tool. He read it as sensible, but also as an indirect hint at what Uber thinks the tools are actually worth. 7

Accounts monitored: @simonw, @karpathy, @sama, @ylecun, @AndrewYNg, @OpenAI, @AnthropicAI, @GoogleDeepMind, @gdb, @fchollet, @emollick, @naval, @paulg, @benedictevans. Posts filtered to the past 24 hours; pure retweets, small talk, and promotional-only posts excluded.
