Introduction
Machine learning models are powerful tools for solving complex problems, but they can easily become overly complex themselves, leading to overfitting. Regularization techniques help prevent overfitting by imposing constraints on the model’s parameters. One common regularization technique is L2 regularization, which is often referred to as weight decay. In this blog post, we’ll explore the big idea behind L2 regularization and weight decay, their equivalence in stochastic gradient descent (SGD), and why weight decay is preferred over L2 regularization in more advanced optimizers like Adam.
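To make the SGD equivalence concrete, here is a minimal sketch in plain Python. It is not taken from the post itself: the toy squared-error loss, the data points, and the names `sgd_l2_step` and `sgd_weight_decay_step` are assumptions chosen for illustration. One update adds the gradient of the penalty (lam / 2) * ||w||^2 to the loss gradient, the other shrinks the weights directly by a factor (1 - lr * lam).

```python
def grad_loss(w, x, y):
    """Gradient of the toy loss 0.5 * (w.x - y)^2 with respect to w."""
    pred = sum(wi * xi for wi, xi in zip(w, x))
    return [(pred - y) * xi for xi in x]


def sgd_l2_step(w, x, y, lr, lam):
    """SGD on loss + (lam / 2) * ||w||^2: the penalty adds lam * w to the gradient."""
    g = grad_loss(w, x, y)
    return [wi - lr * (gi + lam * wi) for wi, gi in zip(w, g)]


def sgd_weight_decay_step(w, x, y, lr, lam):
    """Plain SGD step on the loss, with the weights shrunk by (1 - lr * lam)."""
    g = grad_loss(w, x, y)
    return [(1 - lr * lam) * wi - lr * gi for wi, gi in zip(w, g)]


# Hypothetical toy data: (features, target) pairs.
data = [([1.0, 2.0], 3.0), ([0.5, -1.0], 0.0), ([2.0, 0.0], 1.0)]

w_l2 = [0.3, -0.7]
w_wd = [0.3, -0.7]
for _ in range(100):
    for x, y in data:
        w_l2 = sgd_l2_step(w_l2, x, y, lr=0.05, lam=0.1)
        w_wd = sgd_weight_decay_step(w_wd, x, y, lr=0.05, lam=0.1)

# Largest difference between the two weight vectors after training.
max_diff = max(abs(a - b) for a, b in zip(w_l2, w_wd))
print(max_diff)
```

The two weight vectors should agree up to floating-point rounding, because both updates are the same expression rearranged. In an adaptive optimizer such as Adam, the penalty gradient would be rescaled along with the loss gradient, so the two formulations no longer coincide.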
Update: Trending on Hacker News, follow the discussion here. I’ve built a small library for building agents controlled by large language models (LLMs), heavily inspired by langchain. You can find the library with all the code on GitHub. The goal was to get a better grasp of how such an agent works and to understand it all in very few lines of code. Langchain is great, but it already has quite a few files and abstraction layers, so I thought it would be nice to build the most important parts of a simple agent from scratch.
In a previous blog entry, we used langchain to make a Q&A bot out of the content of your website. The GitHub repository which contains the code of the previous as well as this blog entry can be found here. It was trending on Hacker News on March 22nd and you can check out the discussion here. This blog post builds on the previous entry and creates a chatbot that you can ask questions interactively, similar to how ChatGPT works.
If you want to learn how to create embeddings of your website and how to use a question answering bot to answer questions which are covered by your website, then you are in the right spot. The GitHub repository which contains all the code of this blog entry can be found here. It was trending on Hacker News on March 22nd and you can check out the discussion here. We will approach this goal as follows:
A great site to discover images generated by Stable Diffusion (or their custom model called Aperture) is Lexica.art. Lexica provides an API which can be used to query images matching a keyword or topic. The API returns image URLs, sizes, and other metadata such as the prompt used to generate the image and its seed. The goal of this blog post is to visualize the similarity of images from different categories in an interactive plot which can be explored in the browser.