Plutonic Rainbows

The Metabolic Cost of Looking Back

Certain kinds of thinking cost more than others. Mentally returning to a past moment — really returning, not just glancing — requires the mind to reconstruct something that no longer exists. The room, the light, the particular quality of a voice. When that moment carries emotional weight, the reconstruction doesn't stay intellectual. The body enters it too. Heart rate shifts. Breathing changes. The nervous system begins responding to something that isn't happening.

This is expensive.

I don't mean expensive in some vague, moralising way. Psychologists have a term for this pattern: rumination. The word comes from the digestive process of cows — chewing the same material over and over. When applied to thought, it describes the repetitive focus on distressing content without movement toward resolution. Research published in Stress and Health this year found that people who score high on rumination measures show exaggerated cardiovascular responses to stress and, critically, slower recovery afterward. The body stays activated longer. It doesn't settle.

There's a difference between remembering and dwelling that took me years to understand. Remembering can be reflective, even nourishing — a way of honouring what happened, integrating it, letting it inform the present without dominating it. Dwelling is something else. Dwelling is immersive, comparative, and repetitive. It doesn't integrate. It displaces. The present gets evaluated not on its own terms but against a version of the past that has been retrospectively polished until it gleams.

That comparison is unwinnable.

The past you're measuring against isn't even accurate anymore. Memory doesn't archive experience faithfully. Every recollection is a reconstruction — and reconstruction favours emotional intensity over factual precision. A period that was actually mixed, containing both good and difficult moments, can crystallise into pure golden light when viewed from sufficient distance. The mundane parts drop away. What remains is the atmosphere, stripped of its complications. You end up competing with a ghost that never existed.

A study from the University of Liverpool identified dwelling on negative events as the single biggest predictor of both depression and anxiety. Not the events themselves — the dwelling. The cognitive pattern of returning again and again, generating alternatives that cannot be pursued, asking questions that cannot be answered. What if I had stayed? What if I had said something different? The brain is remarkably good at generating counterfactuals. It is remarkably bad at closing them when the alternatives are impossible. The loop has no exit.

I've been writing about memory for a while now, trying to understand why certain fragments refuse to stay in the past. Part of the answer, I think, is that emotionally vivid memories don't behave like dated entries in a calendar. They feel concurrent with the present. They resist being filed under "then." When I dwell on such a memory, I'm not looking backward at a fixed point. I'm experiencing something that seems to exist alongside now, competing for the same attention, drawing from the same limited pool of emotional energy.

And that pool is limited. Attention, once fixed, is expensive to keep fixed. Emotion that has nowhere to go — no corrective action, no completion, no resolution — exhausts rather than motivates. This is one reason beautiful memories can leave a person feeling depleted afterward. The emotion is real. The activation is real. But there's nothing to do with it. No way to act. The feeling cycles without discharge.

I should say plainly: I don't think any of this means the past should be ignored or that reflecting on difficult memories is inherently harmful. The problem isn't memory. The problem is a specific relationship to memory — one characterised by repetition without integration, by comparison without acceptance, by emotion without agency. The psychological literature calls this "brooding" as opposed to "reflective pondering." Brooding predicts worse outcomes. Reflective pondering can actually help.

The distinction is subtle but feels obvious once you notice it. Reflective pondering asks what happened and what it means. Brooding asks why this happened to me and whether it could have been different. One moves toward understanding. The other moves toward a wall.

Some of the fatigue, I suspect, comes from temporal misallocation of meaning. When a specific period of the past comes to carry disproportionate emotional weight, the present is quietly stripped of legitimacy. New experiences feel thin because they're not allowed to matter in the same way. They're measured against something that has been idealised through distance and repetition. Even neutral or potentially good moments struggle to register because attention has been monopolised elsewhere.

I notice this in myself. There are stretches of time when my present life is fine — genuinely fine, not pretending — but a certain flavour of memory keeps surfacing, and each surfacing takes something. Not much. But accumulating. Like a tax on attention. After a day of this, I'm tired in a way that doesn't correspond to what I've actually done. The body knows it has been working even if the work is invisible.

Recent research in Frontiers in Psychology found that fatigue itself can trigger rumination, creating a feedback loop. Tired people dwell more. Dwelling makes people tired. The cycle reinforces itself. Breaking out requires noticing the pattern — recognising when remembering stops adding depth and starts extracting vitality. That recognition doesn't fix anything by itself, but it marks the point where awareness begins to replace compulsion.

Self-compassion appears to help. Not in the sense of empty reassurance, but in the sense of treating yourself with the same patience you'd offer someone else caught in the same loop. A study published in Nature this year found that self-compassion mediates the relationship between self-critical rumination and anxiety. Which is to say: how you relate to the pattern matters as much as the pattern itself. Beating yourself up for dwelling only adds another layer to the thing you're dwelling on.

I'm not sure I've gotten better at this. I've gotten better at noticing it, which is something. When I catch myself returning to the same moment for the third or fourth time in a day, I can sometimes name what's happening: reconstruction is active, the body is responding to something that isn't here, energy is being spent on a comparison I cannot win. Naming it doesn't stop it. But naming it creates a small gap between the experience and my identification with it.

The hard part — the honest part — is accepting that some memories will keep arriving whether I want them to or not. They'll bring their weather with them. The question isn't how to make them stop. The question is whether I let them run the whole day or whether I can acknowledge their arrival and then, with effort, redirect attention to something I can actually affect.

Some days I manage. Some days I don't.


Where Claude Code Goes From Here

The 2.1.0 release landed a few days ago, and buried in the changelog are some features that hint at where Anthropic is taking this thing. Session teleportation — the ability to resume a terminal session at claude.ai/code using /teleport — sounds like a convenience feature until you realise what it actually enables. I can start something complex on my laptop, close the lid, and pick it up on my phone later. The session state persists somewhere in Anthropic's infrastructure, waiting.

This feels like the beginning of something larger. The pattern I'm seeing across recent releases suggests Anthropic is building toward persistent agents that survive individual sessions. Not just chat history — actual running context that carries forward. The hooks system they added for skills and agents points in the same direction. You can now define PreToolUse and PostToolUse logic that scopes to specific contexts. That's infrastructure for agents that remember what they were doing and why.

The Chrome integration is interesting too. Beta, obviously, but the idea of controlling a browser directly from the terminal opens up workflows I hadn't considered. Automated testing that actually sees the page. Form filling. Screenshot analysis. It's not that any individual capability is new — it's that they're converging into something more coherent.

I'm not sure Anthropic has figured out where the boundaries should be. The Explore subagent, which uses Haiku to search codebases efficiently, saves context by doing lightweight reconnaissance before committing the main model's attention. Smart, but it also means decisions about what's relevant happen outside my visibility. Sometimes it finds exactly what I need. Sometimes it misses something obvious because the cheaper model didn't recognise its importance. The tradeoff makes sense economically; I'm less certain it makes sense epistemically.

What I'm watching for next: multi-session orchestration. The teleportation feature only works for resuming a single session right now. But the infrastructure clearly supports more than that — spawning background agents that report back, coordinating work across multiple contexts, that sort of thing. Cowork plugins already hint at this. Companies can apparently build internal plugin catalogs now. The pieces are assembling.

My guess — and this is speculation — is that Anthropic ships proper agent orchestration within the next few months. Not as a separate product, but as an extension of what Claude Code already does. The session teleportation, the hooks system, the subagent architecture: these aren't random features bolted on. They're scaffolding for something more ambitious. Whether that ambition lands gracefully or creates new categories of confusion remains to be seen. The history of agentic AI is littered with impressive demos that fell apart in production.

For now, I'm mostly pleased with where things are. The asking-too-often problem hasn't disappeared, but the tool has gotten better at knowing when to just proceed. The codebase search actually works. The Chrome stuff is rough but promising.


Wonder 2 Finally Shows Some Restraint

The original Wonder model turned everything into watercolour. Skin looked airbrushed, fabric lost its weave, and faces — especially small ones in group shots — came out waxy. I ran a batch of family photos from the early 2000s through it last year and the results were unusable. Everyone looked like they'd been smoothed in Photoshop by someone who'd just discovered the blur tool. I stopped using it.

Wonder 2 is different. Topaz finally acknowledged what users had been complaining about: the over-processing. The new model dials back the artificial sharpening and actually preserves texture. Hair looks like hair. Skin has pores. Fabric keeps its weave instead of melting into some vague suggestion of cloth.

I tried the same batch again. The difference is significant. Not perfect — I'm not sure any upscaler handles compression artefacts from early digital cameras gracefully — but the faces are recognisably human now. That waxy sheen is gone.

The catch: it's cloud-only. Topaz says the computational demands are too heavy for local processing, which is probably true, but it means you're uploading your images to their servers. For personal photos, I don't love that. For client work, some people won't accept it at all. I understand the technical reasoning — these models are enormous and running them locally would require hardware most photographers don't have — but it still feels like a step backward in terms of control.

There's also the subscription question. Topaz moved to a subscription model last year, which rubbed a lot of long-time users the wrong way. I'm not going to relitigate that argument here. The software works or it doesn't. For me, Wonder 2 works well enough that I've started using Topaz again after months of avoiding it.

What I actually wanted from an upscaler was always simple: make the image bigger without making it worse. Don't add detail that wasn't there. Don't smooth things that should be rough. Don't sharpen edges until they ring. Just scale it up and preserve what exists. Wonder 2 gets closer to that than anything else I've tried. It's not magic — you can't turn a 200x300 thumbnail into a printable image — but for moderate upscaling of decent source material, it does the job without leaving obvious fingerprints.

The Fidelity update that shipped alongside Wonder 2 includes a bunch of other models too. Recover 3 for softer results, some video stuff I haven't tested. But Wonder 2 is the one that matters to me. It's the reason I'm writing this instead of just ignoring another Topaz release.


The Mud Remembers Everything

I found my copy of Wintering Out in a box I hadn't opened since moving flats in 2019. The spine was cracked in three places, the pages yellowed in that particular way paperbacks get when they've lived in damp rooms. I'd forgotten I owned it.

Seamus Heaney published this collection in 1972, the year of Bloody Sunday, though you wouldn't necessarily know that from reading it. The violence is there — it's always there in his work from this period — but it comes at you sideways, through bog bodies and place names and the particular way rain sounds on different surfaces. He doesn't write about soldiers. He writes about the word "Anahorish" and how it feels in the mouth.

That indirection annoyed me when I first read him. I was twenty-two, impatient, wanted poets to say what they meant. Why all this business about etymology and townlands when people were dying? It felt like evasion. I put the book aside and didn't pick it up again for over a decade.

I was wrong.

The poems in Wintering Out aren't avoiding the Troubles — they're excavating the ground beneath them. When Heaney writes about the word "Broagh" and how the "gh" sound at the end is unpronounceable for English speakers, he's writing about borders. About who belongs and who doesn't. About how language itself draws lines that bodies later bleed across. This isn't evasion. It's archaeology.

"The Tollund Man" is the poem everyone talks about, and for good reason. A body preserved in a Danish bog for two thousand years, sacrificed to some fertility goddess, becomes a lens for looking at sectarian murder in Belfast. The logic shouldn't work. Denmark isn't Ireland, and ritual killing isn't the same as a car bomb. But Heaney makes the connection feel inevitable rather than forced. Both are forms of tribal violence. Both leave bodies in the earth.

I keep thinking about my grandmother's accent. She grew up in Tyrone, moved to England in the fifties, and by the time I knew her, her voice had become something strange — not quite Irish, not quite English, caught between. She pronounced certain words in ways I've never heard anyone else pronounce them. When she died, those pronunciations died with her. That's what Heaney is getting at in poems like "Traditions" — language as inheritance, but also language as loss. Every generation forgets something.

He wrote much of this collection while on sabbatical at Berkeley in 1971. California, of all places. He said the distance loosened something in his form, made the quatrains more relaxed. I find that odd — that you'd need to go to the other side of the world to write about the six inches of soil beneath your childhood home. But maybe that's exactly right. Too close and you can't see it. The Irish memory bank, he called it. Something you can only access from far away.

The shorter poems frustrate me. "Servant Boy" and "Limbo" feel slight next to the longer pieces, sketches rather than finished work. Critics at the time complained that Heaney wasn't addressing the violence directly enough, and I understand the impulse even if I think they were wrong. When your country is tearing itself apart, poems about place names can feel like fiddling while Rome burns.

But that misses what Heaney understood: the violence didn't come from nowhere. It grew from centuries of contested ground, contested language, contested memory. You can't address the present without digging into what made it. The bog preserves everything — bodies, butter, wooden trackways. It's a kind of memory that doesn't forget. Heaney keeps returning to that image because it's doing real work for him. The past isn't past. It's right there, just under the surface, waiting to be cut into.

My copy still smells faintly of the flat I lived in during my twenties. Damp plaster, radiator dust, the particular staleness of single-glazed windows. I don't know why I kept it through three moves. Most of my books from that period went to charity shops or got left on trains. This one survived.

Harold Bloom called Heaney's voice "keyed and pitched unlike any other significant poet at work in the language anywhere." That's the kind of sentence critics write when they can't quite explain what they mean. But he's not wrong. There's something in the sound of these poems — the vowels, the consonant clusters, the way lines break mid-phrase — that doesn't sound like anyone else. You can recognise a Heaney poem by its music before you've parsed a single image.

The collection ends with "Westering," a poem about California, about being far from home, about the direction of travel that the word itself implies. West into the unknown. West into the sunset. West into America, where so many Irish ended up. It's not a conclusion exactly — more of a trailing off, a question left hanging. Where do you go when the ground you came from is contested? What happens to memory when it crosses an ocean?

I've started rereading the bog poems aloud. There's no other way to get them right. The sounds matter in a way that silent reading can't capture. "The Tollund Man" in particular needs to be spoken — the way "Tollund" itself echoes and dulls, the flat vowels of "the mild pods of his eyelids." Heaney was obsessed with how words feel in the body. The tongue, the teeth, the soft palate. Poetry as a physical act.

I still don't love all of it. Some of the mythological pieces feel like exercises. But the best poems here — "Anahorish," "Broagh," "The Tollund Man," "Gifts of Rain" — do something I can't quite name. They make the familiar strange and the strange familiar. They make language itself feel like archaeology, like digging.


Forty-Five Bugs Hiding in Plain Sight

A static site generator seems like the safest possible software. No database. No user authentication. No server-side processing. Markdown goes in, HTML comes out. What could go wrong?

Quite a lot, as it turns out. I spent part of today running a systematic security audit on the Python code that builds this blog, and the results were sobering. Forty-five issues across six severity categories, ranging from XSS vulnerabilities in the search functionality to race conditions in file operations. The code has been running in production for months. Every issue had been hiding in plain sight.

The most serious problem was a classic cross-site scripting vulnerability. The search feature highlights matching text by inserting content directly into the DOM via innerHTML — a pattern that the OWASP DOM-based XSS Prevention Cheat Sheet explicitly warns against. If a malicious post title contained script tags, the search results would execute them. The fix was straightforward: escape HTML entities before highlighting. However, the vulnerability should never have existed in the first place.
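The shape of the fix is simple enough to sketch. This is an illustrative Python version of the escape-then-highlight pattern (the real highlighting runs client-side in the browser; the function and names here are hypothetical, not the blog's actual code): entities are neutralised first, and only then are matches wrapped in a mark tag.

```python
import html
import re

def highlight(text: str, query: str) -> str:
    """Escape HTML entities first, then wrap matches in <mark> tags.

    Escaping before highlighting renders any markup in the source
    text inert, so a malicious title cannot inject script into the DOM.
    """
    escaped = html.escape(text)  # neutralise <, >, &, and quotes
    # Escape the query the same way so it matches the escaped text,
    # then escape regex metacharacters so it is treated literally.
    escaped_query = re.escape(html.escape(query))
    return re.sub(
        f"({escaped_query})",
        r"<mark>\1</mark>",
        escaped,
        flags=re.IGNORECASE,
    )
```

The ordering is the whole point: highlight first and escape second, and the mark tags themselves get escaped while the attacker's markup survives.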

Race conditions appeared in several places. The build script called os.listdir() twice for duplicate detection — once to build a normalisation map, and again to process files. Between those calls, the filesystem could change. The cache file for URL shortening used a naive write pattern that could corrupt data during concurrent builds. The asset copying routine deleted the entire output directory before recreating it, creating a window where the site would be unavailable. Each fix required thinking carefully about atomicity and about the unstated assumption that the filesystem holds still between operations.
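The cache fix, for instance, comes down to writing through a temporary file and renaming it into place. A sketch of the pattern, with illustrative names rather than the blog's actual code:

```python
import json
import os
import tempfile

def write_cache_atomic(path: str, data: dict) -> None:
    """Write a JSON cache via a temp file and an atomic rename.

    os.replace() is atomic on POSIX filesystems, so a concurrent
    build sees either the old cache or the new one -- never a
    half-written file.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so the rename
    # cannot cross a filesystem boundary (which would not be atomic).
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(data, f)
            f.flush()
            os.fsync(f.fileno())  # make sure the bytes reach disk
        os.replace(tmp_path, path)  # atomic swap into place
    except BaseException:
        os.unlink(tmp_path)  # clean up the orphaned temp file
        raise
```

The same write-then-swap idea fixes the asset-copying window: build the new output tree beside the old one and rename, instead of deleting first and copying second.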

Date arithmetic revealed a subtler class of bug. The time-based filtering used timedelta(days=months * 30) to calculate cutoff dates — a calculation that drifts by five or six days over a year. Posts from exactly twelve months ago might or might not appear depending on which months fell within the range. The dateutil library provides relativedelta specifically to handle calendar arithmetic correctly. There was no excuse for not using it.
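The drift is easy to demonstrate. Assuming a twelve-month window, the naive version lands five or six days off depending on the year, while relativedelta subtracts actual calendar months:

```python
from datetime import date, timedelta
from dateutil.relativedelta import relativedelta

today = date(2025, 3, 31)

# Naive arithmetic: 12 * 30 = 360 days, drifting about five days
# per non-leap year (365 - 360).
naive_cutoff = today - timedelta(days=12 * 30)

# Calendar-aware arithmetic: subtract twelve actual months.
correct_cutoff = today - relativedelta(months=12)

print(naive_cutoff)    # 2024-04-05 -- five days late
print(correct_cutoff)  # 2024-03-31 -- exactly one year earlier
```

A post dated in those five stray days falls in or out of the window depending on nothing but the month lengths in between.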

Path traversal prevention was missing entirely. A crafted slug containing ../ could write files outside the output directory. Input validation existed for character sanitisation but not for structural attacks. The oversight was embarrassing.
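A structural check resolves the final path and verifies it still sits under the output root, which catches ../ sequences regardless of what character sanitisation allowed through. A minimal sketch, with hypothetical names:

```python
import os

def safe_output_path(output_dir: str, slug: str) -> str:
    """Resolve a slug inside output_dir, rejecting structural attacks.

    Character sanitisation alone misses sequences like '../', so we
    resolve the final path and confirm it never escapes the root.
    """
    root = os.path.realpath(output_dir)
    candidate = os.path.realpath(os.path.join(root, slug + ".html"))
    # If the resolved path's common prefix with the root is not the
    # root itself, the slug climbed out of the output directory.
    if os.path.commonpath([root, candidate]) != root:
        raise ValueError(f"slug escapes output directory: {slug!r}")
    return candidate
```

The comparison happens after resolution, so symlinks and dot-segments are already collapsed by the time the check runs.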

What strikes me most is that this code was written with agentic coding tools — the same tools that are supposed to bring senior-level expertise to every developer. The tools generated working code that passed all tests and produced correct output. They did not generate secure code. They did not flag the race conditions or the XSS vulnerability or the date arithmetic error. The code worked, which is a different thing from the code being right.

This reinforces something I have been thinking about: no system can verify its own blind spots. The AI that helped write the code could not see what it had missed. The developer reviewing the output — me — did not catch the issues either. Only a deliberate, adversarial audit with a checklist of known vulnerability patterns found what was hiding in plain sight.

The fixes took a few hours. The lesson will last longer. Safe-looking code is not the same as safe code. Static sites are not immune to security issues. And the tools that accelerate development do not eliminate the need for the slow, careful work of verification.


Cleaning the Metadata

Spent the morning performing maintenance on my Roon music library—removing extended attributes from 26,948 audio files. macOS applies com.apple.quarantine and com.apple.provenance attributes to downloaded files as security measures, but these can cause file access issues with Roon's music server. The cleanup was straightforward using xattr -dr commands to recursively remove the problematic attributes. Tested playback afterward with Oneohtrix Point Never's Tranquilizer—no audio quality degradation, exactly as expected. Extended attributes are filesystem metadata stored separately from audio data itself. The files remain unchanged; only the invisible annotations have been stripped away. The library now runs cleaner without these unnecessary flags interfering with normal operation.

Low Frequency Pilgrimages Through Urban Wilderness

Waswaas and The Dullard Sage have constructed something genuinely strange with Und Ewig Ist — an album that feels less composed than discovered. The eight tracks unfold as what the artists call "low frequency field recording excursions," and that description captures the essential character of the work. This is music that moves through environments rather than describing them.

The collaboration spans territory that defies easy categorisation. Tags on Bandcamp list Sufi, cosmic black metal, drone, and modern classical as reference points. However, none of these labels fully accounts for what happens across tracks like "Disorders of Consciousness" or "Datacombs." The low end dominates — rumbling bass frequencies that seem to emanate from the earth itself — while field recordings add texture and occasional brightness to the murky depths.

The dedication to Maryanne Amacher feels particularly apt. Amacher spent decades exploring how sound interacts with physical space and the listening body. Waswaas and The Dullard Sage pursue a similar investigation, creating music that rewards deep listening and physical presence. The cassette edition sold out quickly, though digital versions remain available with bonus tracks.


When Attars Take Flight

Sultan Pasha's decision to reformulate Thebes as an alcohol-based Extrait de Parfum marks a significant departure from the oil-based attar tradition that established his reputation. The original Thebes Grade 1 arrived in 2016 as an homage to Guerlain's discontinued Djedi — a fragrance so evocative that Sultan Pasha described it as the only perfume that had brought him close to tears. After months of painstaking recreation, he captured that spectral atmosphere in oil form, creating what became his signature composition.

Nearly a decade later, the 2025 release transforms that intimate, skin-hugging attar into something altogether different. Working alongside Christian Carbonnel under the new Sultan Pasha Perfumes label, the reformulation explores what happens when you translate oil's density and warmth into alcohol's volatility and projection. The result maintains the core narrative — an ancient Egyptian tomb, the boundary between life and death — while fundamentally altering how that story unfolds in space and time.

The composition itself reads like an exercise in controlled opposition. Bright aldehydes and a white floral bouquet of jasmine, muguet, and rose sit against somber, earthy vetiver and the distinctive chalk-like texture of genuine orris butter. Reviewers consistently note this tension: the fragrance is simultaneously luminous and gloomy, uplifting and ritualistic. One detailed review describes waves of heady florals alternating with leather and salty ambergris, creating an animalic, fatty quality that feels deliberately unsettling.

This approach differs markedly from the attar version's intimate revelation. Alcohol-based perfumes diffuse outward, creating a more public presence that transforms the wearer's relationship to the scent. Where the oil version whispered ancient secrets directly to the skin, the Extrait broadcasts them into the surrounding air. The projection reportedly remains strong for the first two hours before settling closer to the body, with longevity hovering around five hours — a relatively modest performance for an Extrait concentration, suggesting the formula prioritizes complexity over sheer endurance.

The move to alcohol represents more than technical reformulation. Sultan Pasha built his reputation through traditional attar craftsmanship, a method that demands patience and precision but limits commercial reach. Attars require direct application, careful storage, and an understanding that comes through experience. By creating alcohol-based versions of his most celebrated works, he opens a door to audiences who might find oil-based perfumes too unfamiliar or demanding.

However, this accessibility comes with artistic risks. The attar community values the medium's contemplative nature — its quiet intensity, its refusal to announce itself beyond the wearer's personal space. Translating that aesthetic into alcohol requires careful calibration to avoid losing what made the original compelling. Based on early responses, Thebes manages this balance by maintaining its strange, funereal atmosphere even as it reaches farther from the skin. The reformulation amplifies certain aspects — particularly the aldehydic brightness and floral lift — while preserving the dusty, ritualistic core that defines the concept.

Sample sets became available for preorder through January 2026, a deliberate strategy that allows serious enthusiasts to experience the full lineup before committing to full bottles. This approach respects the considered, exploratory mindset that characterizes niche perfume appreciation. These are not fragrances designed for casual purchase; they demand time, attention, and a willingness to sit with discomfort. The animalic qualities alone ensure this remains far from mainstream tastes.

What strikes me most about this release is its timing. The niche perfume market has become increasingly crowded, with countless brands claiming artisanal credentials while churning out derivative compositions. Sultan Pasha's move to alcohol could be read as capitulation to commercial pressure, but the execution suggests otherwise. By maintaining Extrait concentration and preserving the challenging, unconventional character of the original work, he signals that accessibility need not mean simplification.

The question now becomes whether this model succeeds — whether audiences accustomed to attars will embrace the reformulations, and whether those new to Sultan Pasha's work will appreciate what makes these fragrances distinctive. Thebes tests that proposition directly, offering a scent that refuses conventional pleasantness in favor of atmospheric depth. It remains to be seen whether the broader market rewards that uncompromising vision or whether the commercial realities of alcohol-based production eventually push toward safer ground.

For now, Thebes in Extrait form exists as a fascinating experiment in translation, asking how much of an attar's soul survives the journey from oil to alcohol. The early evidence suggests more than you might expect, though undoubtedly something irretrievable remains bound to the original medium. What emerges is neither superior nor inferior, but genuinely different — a parallel interpretation that extends the concept rather than simply reproducing it in another format.


When Architecture Becomes Instrument

Philip Johnson's Glass House served as more than a venue for Ryuichi Sakamoto and Alva Noto's 2016 improvisation — it became the instrument itself. Contact microphones placed on the glass walls captured vibrations, transforming the structure into a resonant body. The resulting album, released in 2018, documents a single 37-minute performance where architectural space and electronic processing merge.

The collaboration marked their first live work together since Sakamoto's cancer diagnosis in 2014. Both artists approached the session with minimal rehearsal, spending only one day preparing before the recording. Sakamoto brought a keyboard and glass singing bowls, while Nicolai contributed his characteristic digital processing. However, the true voice emerged from the building itself.

Yayoi Kusama's installation — Dots Obsession: Alive, Seeking for Eternal Hope — occupied the space during the performance. Sakamoto described looking through the glass walls at the landscape while surrounded by Kusama's dots as "a strange mixture of natural, nature, and artificial things, art." That tension between organic and synthetic pervades the recording. Nicolai's glitches and static rest against Sakamoto's melodic fragments, neither dominating.

The Glass House's transparent walls offered ideal conditions for an experiment in architectural acoustics. What emerged was not merely electronic music performed in a space but music generated from the space itself — a document of place as much as performance.


The Deliberate Slowdown: What Anthropic's Development Pace Tells Us About Sonnet 5

I've been watching Anthropic's release cadence closely over the past year, and something has changed. The company that brought us Claude Opus 4.5 in November 2025 has gone conspicuously quiet. No leaks, no benchmarks teased on Twitter, no cryptic blog posts hinting at breakthrough capabilities. Just silence. That silence, however, tells me more about their next model than any press release could.

The industry has trained us to expect a particular rhythm. OpenAI drops a new model every few months, each one incrementally better than the last. Google races to catch up. The smaller labs scramble to carve out niches. We've come to expect this treadmill of marginal improvements, each accompanied by breathless claims of revolutionary progress. Anthropic participated in this race for a while, but I believe they're stepping off it deliberately.

Consider what we know about their philosophy. The company was founded explicitly on the principle that AI safety cannot be an afterthought. Their Constitutional AI approach isn't marketing — it's baked into their training methodology. They've published papers on interpretability that most companies wouldn't touch because they reveal uncomfortable truths about what we don't understand. This isn't a company optimizing for Twitter engagement or shareholder updates.

So when I look at the gap between Opus 4.5 and whatever comes next, I don't see delay. I see intentionality. I believe Anthropic is rebuilding their development process from the ground up, and the next Sonnet model will reflect that fundamental shift.

The current generation of frontier models, including Anthropic's own, share a common weakness. We can measure their performance on benchmarks, but we struggle to predict their behavior in edge cases. They excel at standard tasks while occasionally producing outputs that reveal concerning blind spots. This unpredictability isn't just an engineering challenge — it's an existential risk that scales with capability. Additionally, the compute required to train these models has grown exponentially, while the improvements have become increasingly incremental.

I suspect Anthropic recognized this pattern and decided to break it. Rather than rush out Sonnet 5 with another ten percent improvement on MMLU, they're likely pursuing something harder. They're probably working on models that can explain their reasoning not as a party trick, but as a core architectural feature. Models that know what they don't know and communicate that uncertainty clearly. Models that scale in safety as aggressively as they scale in capability.
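To make the "knowing what it doesn't know" idea concrete, here is a minimal sketch of what an uncertainty-aware response object might look like from a caller's perspective. Every name here is invented for illustration — none of it is a real Anthropic API — but it shows the basic contract: an answer, a self-reported confidence, and the option to abstain rather than guess.

```python
from dataclasses import dataclass

@dataclass
class CalibratedAnswer:
    """Hypothetical response shape for a model that reports its own uncertainty.

    All field and class names are illustrative assumptions, not a real API.
    """
    text: str
    confidence: float   # model's self-estimated probability of being correct
    abstained: bool     # True when the model declines to answer rather than guess

    def usable(self, threshold: float = 0.8) -> bool:
        """Let callers gate downstream actions on reported confidence."""
        return not self.abstained and self.confidence >= threshold


# A high-confidence answer passes the gate; a low-confidence guess does not.
answer = CalibratedAnswer(text="Paris", confidence=0.97, abstained=False)
guess = CalibratedAnswer(text="possibly 42", confidence=0.35, abstained=False)
print(answer.usable())  # True
print(guess.usable())   # False
```

The point of the sketch is that uncertainty becomes part of the interface, not a disclaimer buried in prose: systems built on top of the model can branch on it predictably.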

This approach demands patience. You can't bolt interpretability onto a model after training and expect meaningful results. You can't patch constitutional principles into an architecture designed around different priorities. If Anthropic is serious about building models that remain aligned as they grow more powerful, they need to redesign the foundation. That takes time.

The economics support this theory as well. Training runs for frontier models now cost tens of millions of dollars at minimum, likely hundreds of millions for the largest experiments. Companies can sustain that spending if each model clearly surpasses its predecessor and generates corresponding revenue. However, as improvements become marginal, the calculus changes. Anthropic has substantial funding, but those reserves are not infinite. A strategic pause to ensure the next model represents a genuine leap rather than an incremental step makes financial sense.

I also notice that Anthropic has been unusually active in publishing research on model interpretability and mechanistic understanding. These papers don't generate immediate commercial value, but they lay groundwork. They suggest a company thinking several moves ahead, building the theoretical foundation for techniques they plan to deploy at scale. When Sonnet 5 eventually arrives, I expect we'll see these research threads woven throughout its architecture.

The competitive landscape reinforces this reading. OpenAI remains the market leader in terms of mindshare, but their recent releases have felt increasingly similar to each other. Google has made impressive strides with Gemini, but they're playing the same game everyone else is playing — faster, bigger, slightly better on benchmarks. There's an opening for a company willing to compete on a different axis entirely. If Anthropic can deliver a model that's not just capable but genuinely more trustworthy and interpretable, they could define a new category of competition.

Think about what enterprises actually need from these models. They don't need another incremental improvement in code generation or mathematical reasoning. They need models they can deploy with confidence, models whose failure modes they understand, models that integrate into systems with predictable behavior. The company that solves those problems will command premium pricing and customer loyalty that benchmark performance alone cannot buy.

So my prediction for Sonnet 5 is specific. I don't think we'll see a traditional release announcement with the usual fanfare. Instead, I expect Anthropic will publish a detailed technical paper explaining new approaches to alignment and interpretability, followed by a model that demonstrates those approaches in practice. The improvements on standard benchmarks might be modest — perhaps even deliberately restrained. The real advances will be in areas we currently struggle to measure: robustness, predictability, transparency.

The timeline is harder to predict, but I'd be surprised if we see anything before mid-2026. Anthropic's silence suggests they're deep in the experimental phase, not polishing a nearly finished product. They're likely running training experiments, evaluating results, iterating on architecture. That process can't be rushed without compromising the principles that differentiate them.

This slower pace might frustrate those of us who refresh the Anthropic homepage daily hoping for news. However, I find it reassuring. We've spent the past few years in a headlong sprint toward more capable AI systems, often with safety and interpretability lagging behind. If one major lab is willing to slow down and do the harder work of building systems that scale safely, that benefits everyone.

The race to AGI continues, but perhaps we need some participants racing toward a different finish line. Anthropic appears to be positioning themselves as exactly that. When Sonnet 5 arrives, I believe it will represent not just an incremental improvement, but a statement about what frontier AI development can and should prioritize. The deliberate slowdown isn't weakness — it's the most ambitious move they could make.