Plutonic Rainbows

Sunday Activities

The suggestions list was staying on screen after I selected a suggested prompt, and the text area kept losing focus. To solve this, I switched from onClick to onMouseDown with e.preventDefault(), which stops the text area from losing focus when I click a suggestion. A short setTimeout then refocuses the text area after the selection. Now the suggestions list disappears as soon as I choose an option, and my cursor stays in place so I can continue typing.

I’ve now built a basic framework for reinforcement learning from human feedback (RLHF), made up of four parts:

  • Feedback Collection: I set up a FastAPI backend with endpoints for submitting feedback, refining prompts, and generating insights. This lets users provide valuable feedback that’s stored in a SQLite database.

  • Data Management: I integrated SQLAlchemy to handle my SQLite database. The system automatically creates a new feedback.db if one doesn’t exist, giving me a clean slate when needed.

  • Training Simulation: I created a script (rlhf_training.py) that retrieves the feedback data, processes it in a dummy training loop, and saves a model checkpoint. This simulates how I could fine-tune my model using the collected human feedback.

  • Model Setup: I ensured my model is loaded with the correct number of labels (to match my feedback ratings) and can seamlessly integrate with both the feedback collection and training processes.

This framework sets the stage for continuous improvement. Now, as I gather more feedback, I can use this data to progressively refine and retrain my model.
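The storage and retrieval steps could be sketched roughly as follows. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module in place of SQLAlchemy and leaves out the FastAPI layer, and the table name, column names, 1 to 5 rating scale and function names are all hypothetical rather than taken from my actual code.

```python
import sqlite3

NUM_LABELS = 5  # assumption: ratings run from 1 to 5, one model label per rating


def open_feedback_db(path=":memory:"):
    """Open the database and create the feedback table if it does not exist."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS feedback (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            prompt TEXT NOT NULL,
            rating INTEGER NOT NULL
        )
        """
    )
    conn.commit()
    return conn


def submit_feedback(conn, prompt, rating):
    """Store one rating, rejecting values outside the assumed scale."""
    if not 1 <= rating <= NUM_LABELS:
        raise ValueError(f"rating must be between 1 and {NUM_LABELS}")
    conn.execute(
        "INSERT INTO feedback (prompt, rating) VALUES (?, ?)",
        (prompt, rating),
    )
    conn.commit()


def load_training_examples(conn):
    """Return (prompt, label) pairs, shifting ratings 1..5 to labels 0..4."""
    rows = conn.execute(
        "SELECT prompt, rating FROM feedback ORDER BY id"
    ).fetchall()
    return [(prompt, rating - 1) for prompt, rating in rows]
```

Passing a file path such as "feedback.db" instead of the default in-memory database would create the file on first use, which matches the clean-slate behaviour described above. The shift from ratings to zero-based labels is the reason the number of model labels has to match the rating scale.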

Sunday Extras

Some other things happening today:

  • A sample of Rosendo Mateu No 5 Elixir arrived. It's unique.

  • Made some small adjustments to Flux.1 [Dev] templates.

  • Began reading The King in Yellow by Robert W. Chambers.

  • Listened to the new MPU101 album.

Flux Updates

I updated all my templates that support image generation to include a high-definition option, while retaining the legacy option because it is likely more cost-effective. The new high-resolution setting now outputs images at 1088×1920 pixels, regardless of whether the orientation is portrait or landscape.

For my Prompt Refiner application, I also added an SQL database to log user feedback ratings on prompts. My plan is to eventually incorporate Reinforcement Learning from Human Feedback (RLHF).

Prompt Refiner Updates

I’ve significantly refined my application’s interface and user experience today by introducing Montserrat as the main font, aligning the two columns so both the refined prompt and AI insights start at the same height, enlarging the text areas for more comfortable typing, and adding a loading spinner that appears whenever a request is processing. I also added a subtle highlight animation for updated content, giving the entire workflow a smoother, more polished feel.

Prompt Refiner

I’ve upgraded my application by integrating a transformer-based model for intent classification, which moves beyond the basic, rule-based system I used initially. Now, instead of relying on simple keyword checks, my app calls a smaller, efficient DistilBERT model that can pick up on more nuanced language patterns. This change makes my pipeline more sophisticated and better prepared for future improvements, such as fine-tuning on my own dataset to achieve domain-specific accuracy.

In addition, I’ve tackled the stability and resource issues I faced before by using a smaller model and explicitly setting it to run on the CPU. This reduces the risk of crashes or silent failures. I’ve also maintained my spaCy-based entity extraction and GPT‑4 integration for generating insights, so my app still returns refined prompts and thorough AI responses. Overall, I feel that my setup is now more robust, extensible, and aligned with best practices in modern NLP.
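A minimal sketch of loading a DistilBERT classifier pinned to the CPU might look like this. It assumes the Hugging Face transformers library; the checkpoint name, the label count, the confidence threshold and both function names are illustrative assumptions, not my actual configuration. A freshly loaded classification head is untrained, so its labels only become meaningful after fine-tuning.

```python
def load_intent_classifier(model_name="distilbert-base-uncased", num_labels=3):
    """Build a text-classification pipeline that runs on the CPU.

    model_name and num_labels are assumptions made for illustration.
    """
    # Imported lazily so this module still loads where transformers is absent.
    from transformers import (
        AutoModelForSequenceClassification,
        AutoTokenizer,
        pipeline,
    )

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(
        model_name, num_labels=num_labels
    )
    # device=-1 tells the pipeline to use the CPU rather than a GPU.
    return pipeline(
        "text-classification", model=model, tokenizer=tokenizer, device=-1
    )


def classify_intent(text, classifier, threshold=0.5):
    """Return the predicted label, or "unknown" when confidence is low.

    classifier is any callable that returns a list holding one dict with
    "label" and "score" keys, the shape a text-classification pipeline
    returns for a single string.
    """
    result = classifier(text)[0]
    if result["score"] < threshold:
        return "unknown"
    return result["label"]
```

Keeping the classifier as a parameter means the rest of the app can be exercised with a stub, without downloading the model.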

Chat GPT-4.5 Preview

Now available for Pro users, with Plus users gaining access next week. I tested it through the API — it’s impressive but significantly more expensive than other models. Hopefully, the cost will decrease soon.

New Updates

  • I have added wan-i2v templates for file upload and video generation.

  • OpenAI have added ten requests a month to Deep Research for Plus users.

  • Started building Prompt Refiner application.

  • Built application launcher for Flux.1 [Dev] templates.

Veo2 Updates

I discovered that my app was failing to display the generated video because I was incorrectly extracting the video URL from the Fal.ai API response. Initially, my code assumed the video data was inside a property called data (i.e., final_obj.data), but in reality, the final result was returned directly as a plain dictionary in final_obj with the structure {"video": {"url": "..."}}. Once I logged the final API response, I realised I needed to use final_obj directly to extract the video URL. This change fixed the issue, and now the correct URL is passed to the template, allowing the video to display as intended.
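The corrected extraction can be sketched as a small helper. The function name and the error handling are hypothetical additions for illustration; only the {"video": {"url": "..."}} shape comes from the response I logged.

```python
def extract_video_url(final_obj):
    """Return the URL from a result shaped like {"video": {"url": "..."}}.

    The result is read directly from final_obj, not from final_obj.data.
    """
    if not isinstance(final_obj, dict):
        raise TypeError(f"expected a dict, got {type(final_obj).__name__}")
    video = final_obj.get("video")
    if not isinstance(video, dict) or not video.get("url"):
        # Listing the keys makes an unexpected response shape easy to spot.
        raise KeyError(f"no video URL in response; keys: {sorted(final_obj)}")
    return video["url"]
```

Raising an error that lists the keys actually present surfaces a shape mismatch immediately, instead of passing an empty URL to the template.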

wan-i2v

Another image-to-video model, this time wan-i2v, which claims to be the next evolution in video generation.

"Built upon the mainstream diffusion transformer paradigm, Wan2.1 achieves significant advancements in generative capabilities through a series of innovations, including our novel spatio-temporal variational autoencoder (VAE), scalable pre-training strategies, large-scale data construction, and automated evaluation metrics. These contributions collectively enhance the model’s performance and versatility."

Open New Page

Trying to figure out a way to open links in new tabs.