Best of your X follows: Claude Tag, OCR 4, and role confusion

The strongest signals today were not clustered around one launch. They split across team agents, document ingestion, browser-side ML, and two safety/research arguments. Coverage window: Jun 22, 18:00 to Jun 23, 18:00 UTC. After filtering pure retweets, small talk, and duplicate Daybreak posts already covered yesterday, the monitored X pool was thin; Simon Willison and Hacker News fallbacks are labeled below.

Enterprise agents and document AI

Claude Tag turns Slack into a shared agent workspace (HN fallback)

Signal: Anthropic launched Claude Tag, a Slack-based way to tag Claude into selected channels with scoped access to tools, data, and codebases; it is in beta for Claude Enterprise and Team customers. 1

Why it matters: Anthropic says its internal version of Claude Tag creates 65% of its product team's code, a claim that positions the tool as a multiplayer agent rather than another single-user chat surface. 1

Watch: The HN thread had 63 points and 25 comments at scrape time, so developer reaction was still early and no broad consensus had formed. 2

Source: Anthropic launch post · HN discussion

Mistral OCR 4 aims at structured enterprise document ingestion (HN fallback)

Signal: Mistral released OCR 4 with bounding boxes, block classification, inline confidence scores, support for 170 languages, and self-hosted deployment for enterprise customers. 3

Why it matters: The release claims an 85.20 score on OlmOCRBench and 93.07 on OmniDocBench, while also warning that benchmark artifacts can mis-score correct outputs in math, multi-column, and scientific documents. 3

Watch: Pricing is explicit enough for quick screening: $4 per 1,000 pages via API, $2 per 1,000 pages through Batch API, and $5 per 1,000 pages through Document AI. 3

Source: Mistral announcement · HN discussion
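
For quick screening, the per-page prices quoted above can be turned into a rough cost estimate. The following is a minimal Python sketch, assuming purely linear pricing with no volume discounts, minimums, or taxes (the summary above does not say either way). The channel keys and the `estimate_cost` helper are hypothetical names for illustration, not part of any Mistral SDK.

```python
# Per-1,000-page prices in USD, as quoted in the item above.
# The dictionary keys are illustrative labels, not official product identifiers.
PRICE_PER_1K_PAGES = {
    "api": 4.00,
    "batch_api": 2.00,
    "document_ai": 5.00,
}


def estimate_cost(pages: int, channel: str) -> float:
    """Estimate the USD cost of processing `pages` pages on one channel.

    Assumes cost scales linearly with page count (an assumption, not a
    documented pricing rule).
    """
    if pages < 0:
        raise ValueError("pages must be non-negative")
    return pages / 1000 * PRICE_PER_1K_PAGES[channel]


if __name__ == "__main__":
    # Compare the three channels for a hypothetical 250,000-page backlog.
    for channel in PRICE_PER_1K_PAGES:
        print(f"{channel}: ${estimate_cost(250_000, channel):,.2f}")
```

Under these assumptions, a 250,000-page backlog comes to $1,000 via the API, $500 via the Batch API, and $1,250 via Document AI.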

Agent-built tools and browser ML

Simon Willison ports Moebius inpainting to WebGPU with Claude Code (Simon fallback)

Signal: Simon Willison used Claude Code to port the Moebius 0.2B image inpainting model from a PyTorch/CUDA setup to a browser demo running ONNX Runtime Web on WebGPU. 4

Why it matters: The finished repo runs the denoising loop client-side, downloads about 1.27 GB of model weights on first run, and caches them in the browser afterward. 5

Watch: The useful part is the artifact trail: research notes, plan, transcript, ONNX export, GitHub Pages demo, and a public repo that shows where the agent succeeded and where Simon had to test and steer. 4

The repository is the best visual break for this item:

simonw/moebius-web (GitHub repository): https://github.com/simonw/moebius-web

Prompt injection and model-risk arguments

Prompt injection framed as role confusion (Simon fallback)

Signal: Simon highlighted Charles Ye, Jasmine Cui, and Dylan Hadfield-Menell's "Prompt Injection as Role Confusion" writeup, which argues that models often infer role from writing style rather than trusting role tags alone. 6

Why it matters: The writeup's most concrete result is that "destyling" reduced average attack success in their dataset from 61% to 10%, suggesting small style changes can shift how a model perceives authority. 7

Watch: The practical takeaway is not a new silver bullet; it is a warning that benchmark success against memorized injection strings may not equal robust role perception. 7

Source: Simon Willison's link post · original writeup
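
To put the reported drop from 61% to 10% in perspective, it helps to separate the absolute change from the relative one. The snippet below is a small arithmetic sketch in Python. It only restates the two averages quoted above and says nothing about per-attack variance or other datasets; `relative_reduction` is a hypothetical helper name.

```python
def relative_reduction(before: float, after: float) -> float:
    """Return the fractional reduction from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before


# Average attack success rates (in percent) quoted in the item above.
before_pct = 61
after_pct = 10

absolute_drop = before_pct - after_pct                # percentage points
relative_drop = relative_reduction(before_pct, after_pct)

print(f"absolute drop: {absolute_drop} percentage points")
print(f"relative drop: {relative_drop:.1%}")
```

The absolute drop is 51 percentage points, and the relative drop is roughly 84%, meaning about one in six of the previously successful attacks still succeeds on average after destyling.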

François Chollet says the 2040 AI stack will not look like today's

Signal: François Chollet argued that AI in 2040 will be built on a much more efficient stack and that current systems are 3-4 orders of magnitude data-inefficient and 4-5 orders of magnitude compute-inefficient. 8

Why it matters: The claim is a research-program signal from the Keras creator and ARC-AGI co-founder: he is pointing attention toward symbolic learning as the route to more efficient AI, not merely bigger scaling runs. 8

Watch: Treat it as a strong thesis, not a benchmark result. The post gives the direction and the claimed efficiency gap, but no new experiment or paper link. 8

Chollet's post is short enough to read in full.

Ethan Mollick flags unclear preparation around Mythos-class risk

Signal: Ethan Mollick wrote that Mythos-level models are likely to invite similar risks, and that those risks could grow if open Mythos-class AI appears in the next 6-12 months. 9

Why it matters: His concrete point is about policy ambiguity: if officials do not clearly state which risks concern them, labs and users may prepare more slowly, or prepare for the wrong failure modes. 9

Watch: This is an opinion signal, not a sourced policy report. It belongs in the digest because it captures how AI-watchers are reading the post-Fable/Mythos uncertainty, but it should not be treated as a factual claim about a specific government decision. 9

Cut from the final list

OpenAI's Daybreak/Codex Security follow-up posts were inside the window, but the same underlying story led yesterday's issue, so they were excluded rather than counted twice. Greg Brockman's "OpenAI for Samsung" post was also excluded because the detail payload exposed too little context to summarize responsibly.

References