
Plutonic Rainbows

Fast Lanes and Locked Gates

Within five days of each other, Anthropic launched Opus fast mode and OpenAI shipped Codex-Spark. Same thesis, different silicon. Anthropic squeezes 2.5x more tokens per second out of Opus 4.6 through inference optimisation. OpenAI distills GPT-5.3-Codex into a smaller model and runs it on Cerebras wafer-scale hardware at over a thousand tokens per second. Both are research previews. Both are gated to developers. Both cost more than their standard counterparts.

The timing isn't coincidence. Coding agents are the first workload where latency translates directly into revenue. A developer staring at a terminal while an agent loops through forty tool calls doesn't care about cost per token — they care about wall-clock minutes. Anthropic charges six times the standard rate for fast mode. OpenAI hasn't published Spark pricing yet, but the Cerebras partnership wasn't cheap. These aren't loss leaders. They're premium tiers aimed at the one audience willing to pay for speed right now.
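
To put rough numbers on the trade, here's a back-of-envelope sketch. The 2.5x speedup and the 6x price multiplier come from the announcements; everything else here (tokens per tool call, baseline serving speed, baseline output price) is an illustrative assumption, not a published figure.

```python
# Back-of-envelope comparison of wall-clock time vs cost for one agent session.
# Only the 2.5x speedup and 6x price multiplier come from the announcements;
# every other number is an assumption for illustration.

TOOL_CALLS = 40                 # agent loop length from the example above
OUTPUT_TOKENS_PER_CALL = 800    # assumed average generation per step
BASE_TOKENS_PER_SEC = 40        # assumed standard serving speed
BASE_PRICE_PER_MTOK = 75.0      # assumed output price, USD per million tokens

def session(speedup: float, price_multiplier: float) -> tuple[float, float]:
    """Return (wall-clock minutes, output cost in USD) for one agent session."""
    tokens = TOOL_CALLS * OUTPUT_TOKENS_PER_CALL
    minutes = tokens / (BASE_TOKENS_PER_SEC * speedup) / 60
    cost = tokens / 1e6 * BASE_PRICE_PER_MTOK * price_multiplier
    return minutes, cost

standard = session(speedup=1.0, price_multiplier=1.0)
fast = session(speedup=2.5, price_multiplier=6.0)

print(f"standard: {standard[0]:.1f} min, ${standard[1]:.2f}")
print(f"fast:     {fast[0]:.1f} min, ${fast[1]:.2f}")
```

On those assumptions, fast mode buys back roughly eight minutes of a thirteen-minute session for about twelve dollars more. Whether that's worth it depends entirely on what the developer's time costs.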

What interests me is the constraint both companies are accepting. Fast mode is Opus with the same weights, just served differently. Codex-Spark is a distilled, smaller model — OpenAI admits the full Codex produces better creative output. Neither approach is free. You either pay for dedicated inference capacity or you trade quality for velocity. There's no trick that makes frontier intelligence and sub-second latency coexist cheaply.

The question everyone keeps asking — will these become generally available? — misframes the situation. The technology already works. The bottleneck is economics. Anthropic can't offer fast mode to every Claude consumer at six times the compute cost without either raising subscription prices or eating the margin. OpenAI can't run every ChatGPT conversation through Cerebras wafer-scale engines. The hardware doesn't exist in sufficient quantity. Their own announcement says they're ramping data-centre capacity before broader rollout.
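
A toy margin model shows why the consumer maths doesn't close. Every input here is an assumption for illustration; none of these figures are published by either company.

```python
# Toy margin calculation for a flat-rate consumer subscription.
# Every number below is an assumed figure for illustration, not a published one.

SUBSCRIPTION_PRICE = 20.0        # assumed monthly price, USD
BASELINE_INFERENCE_COST = 8.0    # assumed monthly serving cost per subscriber, USD
FAST_MODE_MULTIPLIER = 6.0       # the compute-cost multiplier discussed above

def monthly_margin(fast_fraction: float) -> float:
    """Margin per subscriber if `fast_fraction` of their usage runs in fast mode."""
    cost = BASELINE_INFERENCE_COST * (
        (1 - fast_fraction) + fast_fraction * FAST_MODE_MULTIPLIER
    )
    return SUBSCRIPTION_PRICE - cost

for fraction in (0.0, 0.25, 0.5):
    print(f"{fraction:.0%} of usage in fast mode -> ${monthly_margin(fraction):+.2f}/month")
```

With those made-up numbers, a subscriber who pushes half their usage through fast mode turns a twelve-dollar monthly margin into an eight-dollar loss. The exact figures don't matter; the shape does.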

So the honest answer is: speed tiers will generalise, but slowly, and probably not in the form people expect. I'd bet on tiered pricing spreading across the consumer products — a fast toggle in Claude.ai, a "turbo" option in ChatGPT — before the end of the year. But it'll cost extra. The idea that baseline inference gets dramatically faster for free requires either a hardware miracle or margins that neither company can sustain.

The deeper pattern is what I wrote about last month. Speed is becoming the axis of competition because capability gains have slowed enough that users notice latency before they notice intelligence improvements. When both labs ship speed products in the same week, that tells you where the demand signal is loudest. Not smarter. Faster.


The Loop That Writes Itself

GPT-5.3-Codex helped debug its own training. OpenAI said it plainly: "the first model that was instrumental in creating itself." That was ten days ago. This week, ICLR announced their first workshop dedicated entirely to recursive self-improvement, scheduled for Rio in April. Google's AlphaEvolve already discovered algorithmic improvements that beat Strassen's fifty-six-year-old matrix multiplication record. The pieces are landing on the board faster than anyone expected.

Recursive self-improvement — systems that modify their own code, weights, prompts, or architecture to become more capable, then use that increased capability to improve themselves further — has been a thought experiment for decades. Eliezer Yudkowsky warned about it. Nick Bostrom built philosophical scaffolding around it. And for most of that time it remained comfortably theoretical because the systems weren't good enough at the one thing the loop requires: writing better software than the software that already exists.

That constraint is dissolving. Not because we've achieved some sudden breakthrough in machine consciousness or general reasoning, but because the narrow version of self-improvement turns out to be enough to matter. A model doesn't need to understand itself philosophically to optimise its own training pipeline. It just needs to be good at code. And the current generation is good at code.

The METR data makes the trajectory explicit. AI task-completion horizons have been doubling every four to seven months — depending on which estimate you trust — for the past six years. If that holds for another two years, we're looking at agents that can autonomously execute week-long research projects. Another four years and it's month-long campaigns. The trend line itself isn't the alarming part. The alarming part is that the trend doesn't need to hold perfectly. Even if progress halves, the capability gap closes on a timeline measured in quarters, not decades.
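
The extrapolation is easy to reproduce. Assuming a present-day autonomous horizon of roughly four hours (an illustrative figure, not a METR measurement) and the doubling range quoted above, with a fourteen-month case standing in for "progress halves":

```python
import math

# Extrapolate a METR-style task-horizon trend under different doubling periods.
# The four-hour starting horizon is an assumption for illustration; the 4-7 month
# doubling range is the figure discussed in the text, and 14 months models
# the case where progress halves.

CURRENT_HORIZON_HOURS = 4.0    # assumed autonomous task horizon today
WORK_WEEK_HOURS = 40.0
WORK_MONTH_HOURS = 160.0

def months_to_reach(target_hours: float, doubling_months: float) -> float:
    """Months until the horizon reaches target_hours, given exponential doubling."""
    doublings = math.log2(target_hours / CURRENT_HORIZON_HOURS)
    return doublings * doubling_months

for doubling in (4, 7, 14):
    week = months_to_reach(WORK_WEEK_HOURS, doubling)
    month = months_to_reach(WORK_MONTH_HOURS, doubling)
    print(f"doubling every {doubling:>2} months: "
          f"week-long tasks in ~{week:.0f} months, month-long in ~{month:.0f}")
```

At the slow end of the quoted range, week-long tasks arrive in roughly two years and month-long tasks in roughly three. Even in the halved case, week-long autonomy is about four years out, not twenty.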

Dean Ball put it starkly in his recent analysis: America's frontier labs have begun automating large fractions of their research operations, and the pace will accelerate through 2026. OpenAI envisions hundreds of thousands of automated research interns within nine months. Dario Amodei cites 400% annual efficiency gains from algorithmic advances alone. These aren't wild extrapolations from startup pitch decks. These are the people running the labs describing what they see happening inside their own buildings.

However. There's a constraint that rarely gets enough attention in the acceleration discourse. Self-improvement only generates reliable gains where outcomes are verifiable. Code that passes tests. Algorithms with measurable performance. Training runs with clear loss curves. The loop works brilliantly in these domains because you can tell whether the modification actually helped. The system generates a change, measures the result, keeps or discards. Simple evolutionary pressure.
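
The loop itself is almost embarrassingly simple when a verifier exists. A minimal sketch, with propose_patch and run_benchmark as hypothetical stand-ins for whatever proposes the change and whatever verifies it:

```python
import random

# Minimal sketch of verification-gated self-improvement: propose a modification,
# score it against an objective verifier, keep it only if it measurably helps.
# `propose_patch` and `run_benchmark` are hypothetical stand-ins, not any lab's API.

def propose_patch(current: dict) -> dict:
    """Stand-in for a model proposing a change to its own pipeline or config."""
    candidate = dict(current)
    candidate["learning_rate"] = current["learning_rate"] * random.uniform(0.5, 2.0)
    return candidate

def run_benchmark(config: dict) -> float:
    """Stand-in verifier: any measurable score (tests passed, loss, throughput)."""
    return -abs(config["learning_rate"] - 3e-4)   # toy objective with a known optimum

def improve(config: dict, iterations: int = 100) -> dict:
    best_score = run_benchmark(config)
    for _ in range(iterations):
        candidate = propose_patch(config)
        score = run_benchmark(candidate)
        if score > best_score:              # keep only verified improvements
            config, best_score = candidate, score
        # otherwise discard: the loop never trusts an unverified change
    return config

print(improve({"learning_rate": 1e-3}))
```

Swap the toy objective for a real test suite and the toy proposal for a frontier model, and this is recognisably the structure the labs are describing. The whole thing hinges on run_benchmark returning a number worth trusting.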

The loop breaks — or at least stumbles badly — when it encounters domains where verification is ambiguous. Alignment research. Safety evaluation. Novel hypothesis generation. The things that arguably matter most for whether recursive self-improvement goes well or catastrophically. A system can optimise its own matrix operations all day. Whether it can meaningfully improve its own ability to recognise its blind spots is a much harder question, and I suspect the honest answer is no.

So when will genuine recursive self-improvement arrive? It depends on what you mean. The narrow version — models improving their own infrastructure, training pipelines, and deployment tooling — is already here. GPT-5.3-Codex is doing it in production. The medium version — agents that systematically discover architectural improvements and better training recipes — is probably twelve to eighteen months out, conditional on the METR trendline holding. The strong version — a system that improves its own reasoning capabilities in open-ended domains, including the ability to improve its ability to improve — remains genuinely unclear. I'm not confident it's five years away. I'm not confident it's twenty.

What I am confident about is that we'll get the narrow and medium versions before we have any serious framework for governing them. The ICLR workshop is a start — researchers trying to make self-improvement "measurable, reliable, and deployable." But the gap between academic workshops and deployed production systems has never been wider. OpenAI shipped a self-improving model before anyone published a standard for evaluating self-improving models. That ordering tells you everything about the incentive structure.

The Gödel Agent — a system that modifies its own task-solving policy and learning algorithm — climbed from 17% to 53% on SWE-Bench Verified. SICA did something similar. These are research prototypes, not products, but the delta between prototype and product in this field is about eighteen months and shrinking. Probably less now that the prototypes can help close the gap themselves.

I keep coming back to something Ball wrote: the public might not notice dramatic improvements, dismissing them as "more of the same empty promises." That feels backwards to me. The risk isn't that progress will be invisible. The risk is that it'll be visible to the people building it, acting on it, profiting from it — and invisible to everyone else until the loop is already running too fast to audit.


Forty-Seven Percent Would Rather Not

Nearly half of British sixteen-to-twenty-one-year-olds told the British Standards Institution (BSI) they'd prefer to have grown up in a world without the internet. Forty-seven percent. Not a fringe opinion from technophobes or Luddites — a near-majority of the generation that never knew anything else.

The rest of the numbers are worse. Sixty-eight percent said they felt worse about themselves after spending time on social media. Forty-two percent admitted to lying to their parents about what they do online. Forty percent said they maintain a decoy or burner account. Eighty-five percent of young women said they compare their appearance and lifestyle to what they see on their feeds, with roughly half doing so often or very often. These aren't edge cases. This is the baseline experience.

What strikes me isn't the individual statistics — we've had versions of these figures for years. Back in 2018, Apple's own investors were pressuring the company over youth phone addiction, citing surveys where half of American teenagers said they felt addicted to their devices. Seven years later, nothing structural changed. The platforms got stickier. The algorithms got sharper. The age of first exposure dropped. And now the generation that grew up inside the experiment is telling us, plainly, that they wish the experiment hadn't happened.

Fifty percent of respondents said a social media curfew would improve their lives. Twenty-seven percent wanted phones banned from schools. Seventy-nine percent believed tech companies should be legally required to build privacy safeguards. That last number is the one I keep returning to — four out of five young people asking for regulation that adults have spent a decade failing to deliver.

The BSI's chief executive, Susan Taylor Martin, put it in corporate language: "The younger generation was promised technology that would create opportunities, improve access to information and bring people closer to their friends." The research, she said, shows it is "exposing young people to risk and, in many cases, negatively affecting their quality of life." This is what institutional understatement sounds like when the data is screaming.

There's an uncomfortable parallel with how the AI industry is repeating social media's mistakes — the same pattern of externalised harm and internalised profit, the same rehearsed contrition at hearings, the same gap between stated commitments and actual behaviour. The platforms knew what they were doing to adolescents. Internal documents confirmed it. Nothing changed because engagement metrics drove revenue, and revenue was the only number that mattered in the boardroom.

Forty-three percent of the respondents started using social media before the age of thirteen — the legal minimum. Not because their parents approved, but because the platforms made it trivially easy to lie about your age. Then those same platforms sold advertising against the attention of children who shouldn't have been there in the first place.

The generation that was supposed to be "digital natives" — fluent, empowered, connected — is telling us they'd trade it all for something quieter. We should probably listen.
