Four Point Seven Months
May 14, 2026 · uneasy.in/23dc031
The UK's AI Security Institute published a short blog post yesterday afternoon that I think is the most consequential thing anyone has written about frontier models this month. It is not an announcement. It is a number, and a story about why the number keeps moving.
In November 2025, AISI estimated that the length of cyber tasks frontier AI models could autonomously complete was doubling every eight months. That was already, by anyone's standard, fast. Three months later, in February, the institute reran the analysis with fresher data and revised the doubling time to 4.7 months. The acceleration was itself accelerating. Yesterday's update is that even 4.7 months now looks too generous. Claude Mythos Preview and OpenAI's GPT-5.5 have both substantially exceeded the trend the institute was tracking, and AISI is openly unsure whether what it is watching is an isolated jump or the start of a new, faster trajectory.
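To put the two doubling times on a common scale, here is a quick back-of-envelope calculation (mine, not AISI's) of the implied annual growth factor in autonomously completable task length under each estimate:

```python
def annual_growth(doubling_months: float) -> float:
    """Factor by which capability grows in 12 months, given a doubling time."""
    return 2 ** (12 / doubling_months)

for d in (8.0, 4.7):
    print(f"{d}-month doubling -> {annual_growth(d):.1f}x per year")
```

An eight-month doubling time works out to roughly a 2.8x increase per year; 4.7 months works out to roughly 5.9x. The revision did not just shave a few months off the estimate, it more than doubled the implied yearly growth.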
The concrete data sits underneath the abstraction. AISI runs a cyber range called The Last Ones, a 32-step simulated corporate network attack that, until this round of testing, no model had made meaningful progress on. Claude Mythos Preview solved it on six of ten attempts. GPT-5.5 solved it on three of ten. Mythos also returned the first non-zero score on a second range called Cooling Tower, succeeding three times in ten. Two evaluations that sat at or near zero a few months ago are now being solved in a majority, or a substantial minority, of attempts, and the institute's own framing of the trend has shifted from "fast" to "we are not sure what we are looking at any more."
What I keep coming back to is the shape of the announcement. AISI was set up to be a calm, slow, scientific counterweight to the marketing claims of the labs. It runs evaluations, it publishes priors, it revises them when the data demands. Twice in six months it has had to revise its own central estimate sharply downward. That is not a failure on AISI's part. It is the institute working as designed. It is also, however, a signal that the institution best positioned to tell a government when to slow down is now telling that government, repeatedly, that the thing it is measuring is faster than the last time it was measured.
Palo Alto Networks, working under Anthropic's Project Glasswing partnership, reached similar conclusions through its own testing, which CyberScoop covered yesterday. The two reports landed on the same day, from different methodologies, with the same finding. Frontier models are now extraordinarily capable at finding vulnerabilities and turning them into exploit paths in close to real time. The wording varies. The line on the graph does not.
What I notice is the asymmetry of who can act on the number. AISI can publish. The labs can choose to deploy or hold. The British government can negotiate access. None of them, working alone, can slow the curve. The curve is the product of a hundred decisions being made in parallel by people who do not report to AISI and who do not particularly care whether the doubling time is eight months or 4.7 months or, the next time the institute reruns the analysis, something shorter still. The useful question is not what the doubling rate is. It is what anyone is going to do once the rate stops being information and becomes the thing the institute is reporting on for its own sake.
Sources:

- How fast is autonomous AI cyber capability advancing? — AI Security Institute
- Researchers say AI just broke every benchmark for autonomous cyber capability — CyberScoop