Plutonic Rainbows

PDF Generation

I've added a PDF generation feature to the blog that allows readers to download any post as a formatted PDF document. The system uses AWS Lambda with a Python function that converts markdown content to PDF using the ReportLab library. When someone clicks the Download PDF link on a post, their browser sends the post content to an API Gateway endpoint, which triggers the Lambda function to generate and return a properly formatted PDF file. I chose this serverless approach because it keeps costs minimal (typically under $0.02 per month for a personal blog) while providing real-time generation without pre-building PDFs for every post during the blog build process.

The implementation took some iteration to get right. I initially tried WeasyPrint for PDF generation, but quickly discovered it requires system libraries that aren't available in the Lambda environment, so I switched to ReportLab, which is pure Python. The button design also evolved through user feedback: it started as a prominent button with an emoji icon and was refined down to a subtle "Download PDF" text link that appears inline with the post date, using a minimalist gray color that turns blue on hover. I also had to work through some technical challenges with CORS configuration and binary media type handling in API Gateway to ensure the PDFs download correctly as binary files rather than as corrupted base64 text. The end result is a system that generates 2-10 KB PDFs in about 200-500 milliseconds, with proper formatting for headers, lists, code blocks, and other markdown elements.
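To illustrate the binary-handling part, here is a minimal sketch of what a Lambda proxy handler for this flow can look like. The function name, field names, and placeholder bytes are assumptions for illustration; the real function renders markdown to PDF with ReportLab, which is elided here to keep the sketch self-contained. The key detail is that API Gateway needs the PDF content type registered as a binary media type, and the Lambda must base64-encode the body and set `isBase64Encoded`.

```python
import base64
import json

def lambda_handler(event, context):
    # Parse the POSTed post content (hypothetical payload shape).
    body = json.loads(event.get("body") or "{}")
    title = body.get("title", "Untitled")

    # In the real function the markdown is rendered to PDF with ReportLab;
    # a placeholder byte string stands in for the generated document here.
    pdf_bytes = f"%PDF-1.4 placeholder for {title}".encode()

    # API Gateway must have application/pdf (or */*) configured as a binary
    # media type, otherwise the base64 body is delivered as corrupted text.
    return {
        "statusCode": 200,
        "isBase64Encoded": True,
        "headers": {
            "Content-Type": "application/pdf",
            "Content-Disposition": f'attachment; filename="{title}.pdf"',
            "Access-Control-Allow-Origin": "*",  # CORS for browser downloads
        },
        "body": base64.b64encode(pdf_bytes).decode("ascii"),
    }
```

The browser then receives a proper binary download because API Gateway decodes the base64 body before sending the response.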

Agentic Context Engineering

After reading a paper on Agentic Context Engineering, I realized my Claude Prompt Builder had been collecting valuable feedback data without actually learning from it. The paper explored how AI systems can refine themselves by analyzing their own context — and that struck a chord. My system already tracked performance across dozens of tasks, but it lacked a feedback loop. I decided to bridge that gap by introducing a new layer of self-awareness: the Context Evolution Engine — a module designed to analyze historical results and guide smarter prompt decisions.

The engine works quietly and safely. It’s feature-flagged, read-only, and non-disruptive, meaning it observes rather than alters live behavior. By grouping similar tasks through keyword and complexity analysis, it identifies which strategies have historically worked best. When a new task appears, it checks for pattern matches and offers transparent recommendations only if confidence is high. Early analysis of 41 feedback records revealed healthy consistency — no over-engineering and clear success clusters across styling, review, and debugging tasks. Everything remains stable and fully backward compatible, supported by 24 automated tests.
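The grouping-and-recommendation step described above can be sketched as follows. This is a minimal illustration, not the Context Evolution Engine's actual code: the record fields (`keywords`, `strategy`, `success`) and the thresholds are assumptions. The important property is preserved, though: the function is read-only over historical records and returns nothing unless confidence clears a threshold.

```python
from collections import defaultdict

def recommend_strategy(records, task_keywords, min_confidence=0.8, min_samples=3):
    """Read-only recommender sketch: group past feedback records that share
    keywords with the new task, then suggest the historically best strategy
    only when its success rate clears a confidence threshold."""
    stats = defaultdict(lambda: [0, 0])  # strategy -> [successes, total]
    for rec in records:
        if rec["keywords"] & task_keywords:  # keyword overlap = pattern match
            tally = stats[rec["strategy"]]
            tally[1] += 1
            if rec["success"]:
                tally[0] += 1

    best = None
    for strategy, (wins, total) in stats.items():
        if total >= min_samples:  # ignore clusters that are too small
            rate = wins / total
            if rate >= min_confidence and (best is None or rate > best[1]):
                best = (strategy, rate)
    return best  # None when confidence is too low: observe, don't alter
```

Returning `None` in the low-confidence case is what keeps the layer non-disruptive: the existing prompt-selection path runs unchanged unless a strong pattern exists.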

This project reminded me that meaningful improvement doesn’t require sweeping change — it comes from structured evolution. By adding a safe analytical layer, the Prompt Builder now has the foundation to grow intelligently, phase by phase. It’s a cautious but powerful step toward an AI that learns from real-world experience rather than static rules — the essence of agentic context engineering.

Guardrail

I built Guardrail Gateway as an AI safety platform to make interactions with Large Language Models more secure and transparent. It adds a layer of content filtering, policy enforcement, and audit logging between applications and providers such as OpenAI. The system runs on a FastAPI backend with a React frontend, acting as an intelligent proxy that checks every request and response against a set of customizable safety policies before it reaches the model.

The core of the platform is a policy engine that uses regex-based rules with adjustable severity levels and actions like blocking, warning, or redacting content. Right now, I’ve implemented two main policy sets: one for detecting and redacting personally identifiable information, and another for identifying prompt injections or attempts to extract system prompts. Every event is logged for traceability and compliance.
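A regex policy pass of this kind can be sketched in a few lines. The class and field names below are illustrative, not Guardrail Gateway's actual schema; the sketch shows the three actions the post describes (block, warn, redact) and the event list that would feed the audit log.

```python
import re
from dataclasses import dataclass

@dataclass
class Policy:
    name: str
    pattern: str   # regex applied to the request or response text
    severity: str  # e.g. "low" | "high"
    action: str    # "block" | "warn" | "redact"

def apply_policies(text, policies):
    """Minimal policy-engine sketch: evaluate each regex rule in order,
    blocking outright, redacting matches in place, or merely recording a
    warning. Returns the (possibly rewritten) text plus audit events."""
    events = []
    for p in policies:
        if re.search(p.pattern, text):
            events.append({"policy": p.name, "severity": p.severity,
                           "action": p.action})
            if p.action == "block":
                return None, events  # refuse the request outright
            if p.action == "redact":
                text = re.sub(p.pattern, "[REDACTED]", text)
    return text, events  # events would be written to the audit log
```

In the PII case this rewrites the text before it ever reaches the model; in the injection case it stops the request entirely, and in both cases the event record preserves traceability.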

Developers (including myself) can test and tune policies through a web interface, which includes tools for validating configurations, managing policies, and reviewing audit logs. The system uses SQLite for development and PostgreSQL for production, with JWT authentication for secure access and UUID support across databases. Typical requests — from scanning to response logging — complete in about two seconds, with most scans finishing in under 50 ms.

I designed Guardrail Gateway to run quietly in the background, using Python’s asyncio loop on a high port (58001) to minimize interference with other services. It’s written for Python 3.13 and built to scale horizontally thanks to its stateless API design. The frontend, built in React with TypeScript and Vite, includes full documentation for both developers and AI agents.

Search is here

I've finally added search functionality to my blog, after many months of deliberating over styling and performance impact. After considering various options, I implemented a lightweight client-side search that lets readers quickly find posts by typing keywords into the search box now positioned in the header. The search looks through post titles and content excerpts, highlighting matching terms and displaying up to 10 results in a dropdown. It's nothing revolutionary, but it works well — searches execute in under 10 milliseconds once the index is loaded, and the whole implementation adds just 5 KB of JavaScript and CSS to the initial page load.

Search uses a lazy-loading mechanism. Rather than forcing every visitor to download the 143KB search index (containing data for all 629 posts), the index only loads when someone actually clicks or tabs into the search box. This means most visitors who come to read a specific post aren't penalized with extra download time they'll never use. When someone does focus the search input, the index loads in the background while they're typing their query — if they've already entered text by the time it loads, the search runs automatically. It's a simple optimization, but it keeps the blog fast for everyone while still providing instant search for those who need it. The entire search feature added less than half a second to my build time, which felt like a reasonable trade-off for the functionality gained.
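The matching step itself is simple. The real feature is client-side JavaScript, but the logic is language-agnostic, so here is a hedged Python sketch of it; the index entry fields, the scoring, and the function name are assumptions for illustration. Every query term must appear in the title or excerpt, title hits rank higher, and results are capped at ten.

```python
def search_posts(query, index, limit=10):
    """Sketch of the search matching logic: scan titles and excerpts for
    every query term, rank title matches above excerpt-only matches, and
    return at most `limit` posts."""
    terms = [t for t in query.lower().split() if t]
    if not terms:
        return []
    results = []
    for post in index:  # index: [{"title": ..., "excerpt": ..., "url": ...}]
        title = post["title"].lower()
        excerpt = post["excerpt"].lower()
        # Require every term to match somewhere in the post's index entry.
        if all(t in title or t in excerpt for t in terms):
            score = sum(2 if t in title else 1 for t in terms)
            results.append((score, post))
    results.sort(key=lambda r: -r[0])  # stable sort, highest score first
    return [post for _, post in results[:limit]]
```

Because the whole index lives in memory once fetched, each keystroke is just a linear scan over 629 small entries, which is why queries stay under 10 ms.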

Sonnet 4.5

Anthropic has released the new model. Pricing remains the same as Claude Sonnet 4, at $3/$15 per million input/output tokens.

Google Gemini

Apologies for the oversight in my previous post — I should have included Google Gemini in the discussion. Gemini is a key player in the current AI landscape, offering a versatile suite of models that combine strong reasoning, coding, and multimodal capabilities. Leaving it out may have given the impression that it isn’t relevant alongside GPT, Claude, Grok, and Qwen, but in reality, it deserves recognition as one of the most significant entrants shaping the competitive field.

Four AI Heavyweights Shaping the Future

Artificial intelligence has become more than just a buzzword — it’s becoming a daily partner in how we work, create, and even play. From writing code to generating ideas and powering conversations, today’s leading models each bring their own personality and strengths. Let’s take a quick look at four of the most exciting names in the space right now: GPT-Codex, Claude, Grok, and Qwen.

GPT-Codex is the coder’s dream assistant. Developed by OpenAI, it bridges natural language and programming, making it possible to describe your goals in plain English and have them turned into functional code. Whether you’re debugging, migrating projects, or building prototypes, Codex feels like an extra teammate who never gets tired of problem-solving.

Claude, from Anthropic, stands out for its thoughtful and safe design. Instead of just pushing raw power, it focuses on clarity, alignment, and long-form reasoning. This makes it an excellent choice for complex projects, structured workflows, and conversations where nuance matters. With Claude Code, developers in particular are finding new ways to work faster while staying organized.

Grok and Qwen represent the new wave of AI challengers. Grok, from xAI, has built its identity around speed, wit, and humor, making interactions more engaging without sacrificing intelligence. Qwen, from Alibaba Cloud, is all about versatility, offering a wide range of model sizes that excel at multilingual tasks, coding, and even image editing. Both are proof that the AI landscape is getting broader and more dynamic every day.

As these models continue to evolve, the takeaway is clear: there’s no single best AI, only the best fit for your goals. Codex shines in coding, Claude thrives in thoughtful reasoning, Grok brings personality to problem-solving, and Qwen pushes the boundaries of scale and adaptability. Together, they highlight an exciting future where we can choose from a diverse toolkit of digital partners — each designed to help us think, create, and build in new ways.

Qwen3-Max

I had a proper look at this today, and I came away really impressed. The whole suite feels fast and intuitive — snappy edits, clean results, and a layout that doesn’t slow you down. What really grabbed my attention, though, was the colorisation ability. It’s not just a gimmick — it handles subtle tones with surprising accuracy, breathing life into black-and-white images without that washed-out, artificial look you sometimes get elsewhere.

Put side by side with Gemini 2.5 Flash Image (Nano Banana), it’s easily on the same level, and in some respects — especially speed, ease of use, and the natural quality of its colorisation — it might even be ahead. It feels less like an alternative and more like a genuine leap forward in what image editing can offer.

GPT-5-Codex

It was released exclusively to OpenAI Plus and Pro accounts for a few days. It is now also available through the API.

A Flux.1 [Dev] image of Raquel Gibson, 2005.

OpenAI Codex

I have switched over to Codex — it’s much cheaper, and for now it seems far more reliable. I’m not running into the problems that have plagued Claude Code over the past month.

I have managed to get GitHub integration working, with Codex loading the appropriate model and permissions. I will probably use Gemini CLI for planning and stick with Codex for a few weeks.