Plutonic Rainbows

Subresource Integrity

I have added SRI for some files. Subresource Integrity (SRI) is a browser security feature that allows web developers to ensure that resources fetched from external sources — like scripts or stylesheets — haven’t been altered or compromised. It works by including a cryptographic hash in the HTML tag referencing the resource, so when the browser downloads the file, it computes its own hash and compares it to the provided one; if they don’t match, the resource is blocked from loading. This mechanism not only helps protect against potential attacks on third-party content but also boosts overall user trust and website integrity.
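For illustration, here is a minimal Python sketch of how an integrity value can be generated; the file name is hypothetical and sha384 is just one of the digest algorithms browsers accept.

    import base64
    import hashlib

    def sri_hash(path: str, algo: str = "sha384") -> str:
        # Hash the file and encode the digest the way SRI expects: "<algo>-<base64 digest>".
        digest = hashlib.new(algo, open(path, "rb").read()).digest()
        return f"{algo}-{base64.b64encode(digest).decode()}"

    # Hypothetical stylesheet; the printed value goes into the tag's integrity attribute,
    # alongside crossorigin="anonymous" for cross-origin resources.
    print(sri_hash("styles.css"))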

Flux Updates

I have added new endpoints to the Flux templates to fully leverage Juggernaut Base Flux LoRA by RunDiffusion. This serves as a drop-in replacement for Flux [Dev], providing sharper details, richer colours, enhanced realism, and complete compatibility with all LoRAs.

I improved font optimisation on my blog by adding a preconnect hint for my font origin, which sped up font loading. I also preloaded the critical font subset to ensure quicker rendering of important content.

ChatGPT 4.5

This long-awaited model is now rolling out to Plus users. I gained access about two hours after the announcement, which I’m quite pleased about. However, I’m concerned about the rate limit, which is reportedly set at 50 requests per week. I assume this restriction is due to the model’s extremely high operational costs, which will inevitably come down in time.

Updates

I've added a function that displays the number of tokens used per query, separated clearly from the text output. Additionally, I compressed the CSS on the blog for improved performance.
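As a rough sketch of the idea (using the tiktoken library and its cl100k_base encoding, which may not match the exact tokeniser behind my queries):

    import tiktoken

    def count_tokens(text: str) -> int:
        # Count tokens with the cl100k_base encoding; an assumption for this sketch.
        encoding = tiktoken.get_encoding("cl100k_base")
        return len(encoding.encode(text))

    query = "Summarise this week's posts."
    answer = "Here is a summary..."
    # Keep the token count on its own line, apart from the model's text output.
    print(answer)
    print(f"[tokens used: {count_tokens(query) + count_tokens(answer)}]")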

Currently testing Tom Ford's Fucking Fabulous Parfum. This is a revitalised edition of the original, which launched quite a few years back. Honestly, I really dislike the fragrance's name — I find it controversial simply for controversy's sake.

Sunday Activities

I was having trouble with the suggestions list persisting on screen even after I selected a suggested prompt, and the text area kept losing focus. To solve this, I switched from onClick to onMouseDown with e.preventDefault(), which stops the text area from losing focus when I interact with the suggestions. Then, with a small setTimeout to refocus the text area, the suggestions list disappears as soon as I choose an option and the cursor stays in the right place to continue typing.

I’ve now built a solid framework for reinforcement learning from human feedback.

  • Feedback Collection: I set up a FastAPI backend with endpoints for submitting feedback, refining prompts, and generating insights. This lets users provide valuable feedback that’s stored in a SQLite database (a minimal version of the feedback endpoint is sketched after this list).

  • Data Management: I integrated SQLAlchemy to handle my SQLite database. The system automatically creates a new feedback.db if one doesn’t exist, giving me a clean slate when needed.

  • Training Simulation: I created a script (rlhf_training.py) that retrieves the feedback data, processes it in a dummy training loop, and saves a model checkpoint. This simulates how I could fine-tune my model using the collected human feedback (a stripped-down outline follows the list).

  • Model Setup: I ensured my model is loaded with the correct number of labels (to match my feedback ratings) and can seamlessly integrate with both the feedback collection and training processes.
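Here is a minimal sketch of the feedback endpoint and table; only the submission endpoint is shown, and the field names, route path, and 1-to-5 rating scale are illustrative rather than the exact ones in my backend.

    from fastapi import FastAPI
    from pydantic import BaseModel
    from sqlalchemy import Column, Integer, String, create_engine
    from sqlalchemy.orm import declarative_base, sessionmaker

    # feedback.db is created automatically if it doesn't exist yet.
    engine = create_engine("sqlite:///feedback.db")
    SessionLocal = sessionmaker(bind=engine)
    Base = declarative_base()

    class Feedback(Base):
        __tablename__ = "feedback"
        id = Column(Integer, primary_key=True)
        prompt = Column(String)
        response = Column(String)
        rating = Column(Integer)  # e.g. 1-5, later used as a training label

    Base.metadata.create_all(engine)

    app = FastAPI()

    class FeedbackIn(BaseModel):
        prompt: str
        response: str
        rating: int

    @app.post("/feedback")
    def submit_feedback(item: FeedbackIn):
        # Store one piece of human feedback in SQLite.
        with SessionLocal() as session:
            session.add(Feedback(prompt=item.prompt, response=item.response, rating=item.rating))
            session.commit()
        return {"status": "stored"}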
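And a stripped-down outline of what the training script does: read the stored feedback, run a placeholder training pass, and save a checkpoint. The base model, label count, and file names here are assumptions for the sketch, not the actual script.

    import torch
    from sqlalchemy import create_engine, text
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    # Pull the collected feedback (prompt, response, rating) out of SQLite.
    engine = create_engine("sqlite:///feedback.db")
    with engine.connect() as conn:
        rows = conn.execute(text("SELECT prompt, response, rating FROM feedback")).fetchall()

    # Illustrative base model; num_labels matches the assumed 1-5 rating scale.
    model_name = "distilbert-base-uncased"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=5)
    optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

    # Dummy training loop: one pass over the feedback, ratings 1-5 mapped to labels 0-4.
    model.train()
    for prompt, response, rating in rows:
        inputs = tokenizer(prompt, response, return_tensors="pt", truncation=True)
        labels = torch.tensor([rating - 1])
        loss = model(**inputs, labels=labels).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

    # Save a checkpoint of the (nominally) fine-tuned model.
    torch.save(model.state_dict(), "rlhf_checkpoint.pt")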

This framework sets the stage for continuous improvement. Now, as I gather more feedback, I can use this data to progressively refine and retrain my model.