September 23, 2025
It was initially released only to ChatGPT Plus and Pro accounts for a few days. It is now also available through the API.
A Flux.1 [Dev] image of Raquel Gibson, 2005.
September 13, 2025
I have switched over to Codex. It’s much cheaper, and for now it seems far more reliable. I’m not running into the problems that have plagued Claude Code over the past month.
I have managed to get GitHub integration working, with Codex loading the appropriate model and permissions. I will probably use Gemini CLI for planning and stick with Codex for a few weeks.
September 09, 2025
This is what developers are essentially being told right now. After nearly a month of frankly appalling performance, Anthropic claims to have identified and resolved the issues. Yet the wording of its statement is so vague that it offers little reassurance. It doesn’t explain what went wrong, what was actually fixed, or how developers can expect things to improve. Instead, it leaves us with an opaque message that feels more like damage control than genuine clarity.
September 01, 2025
I implemented a balanced guardrail system for the Claude Prompt Builder's adaptive complexity engine to address the verbosity concerns while maintaining essential safety checks. I modified adaptive_prompt_builder.py to scale guardrails to task size: simple tasks (≤800 chars) now receive concise core integrity principles and minimal quality assurance focused on testing; medium tasks (800-2000 chars) get balanced guidance with a condensed runtime priority wrapper and standard QA, including lint/typecheck requirements; and complex tasks (2000+ chars) retain comprehensive orchestration with full guardrails. The key changes were creating three tiers of core integrity (minimal/balanced/full), implementing scaled quality assurance sections (minimal/standard/comprehensive), adding a concise runtime wrapper for medium complexity, and adjusting verbosity targets to realistic levels.
The system now ensures that even simple "fix typo" requests include essential testing reminders without overwhelming users with unnecessary orchestration details, while complex multi-domain tasks still receive the comprehensive guidance they require. Testing confirmed that the prompt generated for simple tasks shrank from 2000 to 700 characters while preserving critical safety checks, which achieves the goal of appropriate scaling without compromising quality control standards.
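For readers who want to picture the tiering, here is a minimal sketch of how a length-based selector might look. It is not the actual code in adaptive_prompt_builder.py: the names `GuardrailTier`, `TIERS`, and `select_tier` are hypothetical, the thresholds are simply the character ranges quoted above, and the handling of exactly 2000 characters (treated as medium here) and the wrapper values for simple and complex tasks are my assumptions.

```python
from dataclasses import dataclass

# Thresholds taken from the ranges quoted in the post. Treating exactly
# 2000 characters as "medium" is an assumption; the post leaves it open.
SIMPLE_MAX_CHARS = 800
MEDIUM_MAX_CHARS = 2000


@dataclass(frozen=True)
class GuardrailTier:
    """Hypothetical container for the three scaled sections."""
    name: str
    core_integrity: str     # minimal / balanced / full
    quality_assurance: str  # minimal / standard / comprehensive
    runtime_wrapper: str    # "none" and "full" are assumed labels


TIERS = {
    "simple": GuardrailTier("simple", "minimal", "minimal", "none"),
    "medium": GuardrailTier("medium", "balanced", "standard", "concise"),
    "complex": GuardrailTier("complex", "full", "comprehensive", "full"),
}


def select_tier(task: str) -> GuardrailTier:
    """Pick a guardrail tier from the length of the task description."""
    length = len(task)
    if length <= SIMPLE_MAX_CHARS:
        return TIERS["simple"]
    if length <= MEDIUM_MAX_CHARS:
        return TIERS["medium"]
    return TIERS["complex"]
```

With this sketch, `select_tier("fix typo").quality_assurance` evaluates to `"minimal"`, so a short request only gets the testing-focused reminders, while anything over 2000 characters falls through to the full tier.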