May 28, 2026 · AI · distillation

The Model Behind the Model

Every time a big AI lab ships a model, it publishes a "system card": a long, dry technical report on what the model can do, how it behaves, and where it might be dangerous. Almost nobody reads them all the way through. I do, because the interesting things tend to hide in the parts that never make the announcement post.

Anthropic released Claude Opus 4.8 today,¹ and I read the whole system card. The model is excellent: fast, a real step up at coding, the kind of upgrade you feel in the first hour. But I came away thinking less about Opus 4.8 than about a different model: one you can't use, that's mentioned on nearly every page, and that Anthropic has been talking about publicly for weeks.

The model behind the curtain

It's called Claude Mythos Preview.

You won't find it in the app; it's "not publicly available."² But it's the yardstick the whole card is measured against, and it's always a step ahead of Opus 4.8 on basically everything. Anthropic calls it "our current most capable model."³

And this isn't a secret model; Anthropic has been openly discussing it. It's the engine behind Project Glasswing, their cybersecurity consortium, and it has its own write-up. Mythos sits a full capability tier above Opus 4.7, and Anthropic has been clear about why it stays locked up: it's good enough at security to autonomously find and weaponize zero-day vulnerabilities, so releasing it openly would make large-scale cyberattacks meaningfully more likely.⁴ So Mythos staying in the back isn't a mystery; it's too dangerous to hand out, and that's completely understandable.

What's interesting to me isn't that they hold it back. It's how they turn it into something they can release.

What "distillation" is

Picture the smartest professor you've ever met: brilliant, but slow, expensive, and impossible to clone. Now sit an apprentice beside them for a year with one job: watch every answer the professor gives, and learn to reproduce it. The apprentice never gets quite as good, but they get shockingly close, and they're cheap, fast, and easy to copy a thousand times.

That's distillation. You take a huge, expensive "teacher" model and train a smaller "student" model to imitate its outputs; the student inherits most of the teacher's knowledge and behavior without the teacher's size or cost. The idea goes back to a 2015 paper from Geoffrey Hinton's group, and it's standard practice; it's how the cheap, fast versions of these models get built. Anthropic says so itself: their Haiku model was made by "transferring knowledge from the 'teacher' (Claude 3.5 Sonnet) to the 'student' (Claude 3 Haiku)."
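To make the apprentice picture concrete, here is a minimal sketch of the classic soft-target loss in the style of the 2015 paper, written as a KL divergence between temperature-softened teacher and student distributions. The function names, the temperature of 2.0, and the example logits are all assumptions chosen for illustration; nothing here is taken from Anthropic's training setup, which the card does not describe at this level.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a higher temperature flattens them."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    The T^2 factor keeps gradient magnitudes comparable across temperatures.
    """
    p = softmax(teacher_logits, temperature)  # the professor's answer
    q = softmax(student_logits, temperature)  # the apprentice's attempt
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


# Illustrative logits over three possible next tokens (made-up numbers).
teacher_logits = [4.0, 1.0, 0.5]
untrained_student = [0.0, 0.0, 0.0]  # knows nothing yet
trained_student = [3.5, 1.2, 0.4]    # has learned to imitate the teacher

print(distillation_loss(untrained_student, teacher_logits))
print(distillation_loss(trained_student, teacher_logits))
```

Training the student means pushing this number toward zero across an enormous number of examples. The loss is exactly zero only when the student's softened distribution matches the teacher's.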

Why I think Opus 4.8 is the student

A few things, read together, convinced me.

The capabilities all converge on Mythos. Opus 4.8 never beats it on raw ability; it just sits underneath it on every measure I checked.⁵ In fact, on metric after metric it lands between its predecessor, Opus 4.7, on one side and Mythos Preview on the other:⁶

Metric	Claude Opus 4.7	Claude Opus 4.8	Claude Mythos Preview
Anthropic capability index (AECI)	154.1	155.5	158.3
CyberGym, vulnerabilities solved	73.1%	78.8%	83.1%
ExploitBench (AutoNudge, out of 16)	3.66	5.45	9.90
Firefox, full working exploits	1.2%	8.8%	70.8%
DeepSearchQA (F1)	89.4%	93.1%	94.4%
DRACO (normalized)	77.7	80.4	83.7
Constitution endorsement (0 to 10)	7.6	7.9	8.3
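To see how far along the gap Opus 4.8 actually sits, one rough check is to express each row as a fraction of the distance from Opus 4.7 (0.0) to Mythos Preview (1.0). The sketch below uses only the figures from the table above; the `relative_position` helper is my own illustrative name, and treating every metric as linear is a simplifying assumption.

```python
# (Opus 4.7, Opus 4.8, Mythos Preview) for each row of the table above.
rows = {
    "Anthropic capability index (AECI)": (154.1, 155.5, 158.3),
    "CyberGym, vulnerabilities solved (%)": (73.1, 78.8, 83.1),
    "ExploitBench (AutoNudge, out of 16)": (3.66, 5.45, 9.90),
    "Firefox, full working exploits (%)": (1.2, 8.8, 70.8),
    "DeepSearchQA (F1, %)": (89.4, 93.1, 94.4),
    "DRACO (normalized)": (77.7, 80.4, 83.7),
    "Constitution endorsement (0 to 10)": (7.6, 7.9, 8.3),
}


def relative_position(old, new, top):
    """Fraction of the old-to-top gap that the new model has closed."""
    return (new - old) / (top - old)


positions = {name: relative_position(*scores) for name, scores in rows.items()}

for name, pos in positions.items():
    print(f"{name:<40} {pos:.2f}")
```

Every row comes out strictly between 0 and 1, which is the pattern I mean. The fractions are not uniform, though: the Firefox exploit row sits near the Opus 4.7 end (about 0.11) while DeepSearchQA sits near the Mythos end (about 0.74).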

A weaker model is normal. What's striking is that its personality converges too: the card says Opus 4.8 "shows a similar profile to our best-aligned model, Mythos Preview."⁷ A smaller model that quietly inherits a specific bigger model's whole way of behaving is exactly what a distilled student looks like.

Then there's the fine print. Opus 4.8 was trained on "synthetic data generated by other models."⁸ Which models? The most capable one in the building was Mythos.

And honestly, Anthropic more or less pre-announced this. When they launched Glasswing, they said the safeguards needed to deploy Mythos-class capability safely "will be launched with an upcoming Claude Opus model."⁴ That upcoming model is the one that just shipped. Opus 4.8 reads like the safe, public, distilled step down from a model they were never going to release directly.

The part I keep thinking about

Here's the twist. Distilling a frontier model's outputs into a cheaper one is the exact thing DeepSeek was accused of doing to OpenAI in early 2025, and when a rival was said to have done it through an API, the U.S. government and the labs called it theft, even a national-security problem.⁹ Anthropic appears to have used the same fundamental technique (train a smaller model on a bigger model's outputs), except pointed inward, at its own model. Same trick, very different framing: when a competitor does it, it's an attack; when you do it to yourself, it's a pipeline.

And maybe that pipeline is the real story, the shape of how frontier AI gets built from here. Take a giant leap in private (Mythos). Then distill it downward in public steps, each one tuned to the price point you can afford to serve, the safety bar you're willing to clear, and the capabilities you're comfortable exposing, until it's ready to ship. The frontier model and the product model drift apart on purpose. We argue endlessly about "how good is AI right now," and we're arguing about the student. The teacher never comes to class.

Opus 4.8 is a wonderful model. I just can't stop seeing it as a deliberately throttled portrait of one we'll never get to meet, and the gap between the model behind the curtain and the one in your chat window is the game the labs are now quietly playing with the public.

A note on how this was made: the thoughts, arguments, and conclusions here are entirely my own. I fed them through my own automated AI writing system, which drafted and assembled the post from my notes.

¹ Anthropic, System Card: Claude Opus 4.8, May 28, 2026.
² System Card: Claude Opus 4.8, p. 53 ("As Mythos Preview is not publicly available..."); see also p. 67.
³ System Card: Claude Opus 4.8, p. 44.
⁴ Anthropic, Project Glasswing and Claude Mythos Preview. Anthropic states that Mythos sits a capability tier above Opus 4.7, that it does not plan to release it generally, and that safeguards for deploying Mythos-class models "will be launched with an upcoming Claude Opus model."
⁵ System Card: Claude Opus 4.8, p. 42 (Opus 4.8 "sits between Claude Opus 4.7 and Claude Mythos Preview ... and does not advance our capability frontier").
⁶ All figures verified against System Card: Claude Opus 4.8: Anthropic capability index (AECI), p. 42; CyberGym pass@1, p. 50; ExploitBench AutoNudge (mean flags, max 16), p. 49; Firefox full-working-exploit rate, p. 52; DeepSearchQA F1, p. 204; DRACO normalized score, p. 207; constitution endorsement, p. 188. Opus 4.8 sits between Opus 4.7 and Mythos Preview on every row.
⁷ System Card: Claude Opus 4.8, p. 2.
⁸ System Card: Claude Opus 4.8, p. 10 ("a proprietary mix of publicly available information from the internet, public and private datasets, and synthetic data generated by other models").
⁹ Beatrice Nolan, "DeepSeek used OpenAI's model to train its competitor using 'distillation,' White House AI czar says," Fortune, Jan 29, 2025.
