LoRA - low-rank adaptation explained in three minutes

LoRA (Low-Rank Adaptation) is a technique that updates only a small set of low-rank matrices instead of adjusting all the parameters of a deep neural network. This significantly reduces the computational cost of training. LoRA is particularly useful when working with large language models (LLMs), which have a huge number of parameters that would otherwise need to be fine-tuned. The core concept: reducing complexity with low-rank decomposition.
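To make the mechanism concrete, here is a minimal sketch of a LoRA-style layer in PyTorch; the class name and the hyperparameters `r` and `alpha` are illustrative assumptions, not the reference implementation. The pretrained weight stays frozen while two small trainable matrices B and A learn a rank-r update added to the layer's output.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank update (illustrative sketch)."""

    def __init__(self, in_features, out_features, r=8, alpha=16):
        super().__init__()
        self.base = nn.Linear(in_features, out_features)
        self.base.weight.requires_grad_(False)  # pretrained weights stay fixed
        self.base.bias.requires_grad_(False)
        # Low-rank factors: A is (r x in), B is (out x r).
        # B starts at zero, so training begins from the pretrained behavior.
        self.A = nn.Parameter(torch.randn(r, in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(out_features, r))
        self.scale = alpha / r

    def forward(self, x):
        # base(x) + scale * (B @ (A @ x)): only A and B receive gradients.
        return self.base(x) + self.scale * ((x @ self.A.T) @ self.B.T)
```

For a 4096 x 4096 layer with r = 8, the trainable factors hold about 65k parameters versus roughly 16.8M in the frozen weight, which is where the savings come from.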
Read more →

Understanding the difference between weight decay and L2 regularization

Machine learning models are powerful tools for solving complex problems, but they can easily become overly complex themselves, leading to overfitting. Regularization techniques help prevent overfitting by imposing constraints on the model's parameters. One common technique is L2 regularization, often used interchangeably with weight decay. In this blog post, we'll explore the big idea behind L2 regularization and weight decay, their equivalence under stochastic gradient descent (SGD), and why decoupled weight decay is preferred over L2 regularization in more advanced optimizers like Adam.
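As a minimal sketch of why the two coincide under plain SGD (the function names and hyperparameters here are illustrative assumptions): L2 regularization adds lam * p to the gradient, while decoupled weight decay shrinks the weight directly, and for vanilla SGD the two update rules are algebraically identical.

```python
import torch

def sgd_step_l2(p, grad, lr=0.1, lam=1e-2):
    # L2 penalty (lam/2) * ||p||^2 contributes lam * p to the gradient.
    return p - lr * (grad + lam * p)

def sgd_step_decay(p, grad, lr=0.1, lam=1e-2):
    # Decoupled weight decay: shrink the weight, then take the gradient step.
    return p * (1 - lr * lam) - lr * grad

p = torch.tensor([1.0, -2.0])
g = torch.tensor([0.5, 0.5])
# p - lr*(g + lam*p) == p*(1 - lr*lam) - lr*g, so the steps match exactly.
assert torch.allclose(sgd_step_l2(p, g), sgd_step_decay(p, g))
```

In Adam this equivalence breaks: the lam * p term added by L2 regularization gets rescaled by the adaptive second-moment denominator, whereas decoupled weight decay (as in AdamW) bypasses that rescaling.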
Read more →

Semantic segmentation with prototype-based consistency regularization

Semantic segmentation is a complex task for deep neural networks, especially when limited training data is available. Unlike image classification problems such as ImageNet, semantic segmentation requires a class prediction for every individual pixel rather than a single image-level class. This demands a high level of detail and is difficult to achieve with limited labeled data. Obtaining labeled data for semantic segmentation is challenging, as it requires precise per-pixel annotation, which is time-consuming for humans.
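To illustrate the difference in output granularity, here is a small PyTorch sketch comparing the two prediction shapes; the batch size, resolution, and 21-class setup (as in PASCAL VOC) are assumptions chosen for illustration.

```python
import torch

# Image classification: one label per image.
cls_logits = torch.randn(4, 21)             # (batch, num_classes)
cls_pred = cls_logits.argmax(dim=1)         # (4,) -> one class per image

# Semantic segmentation: one label per pixel.
seg_logits = torch.randn(4, 21, 128, 128)   # (batch, num_classes, H, W)
seg_pred = seg_logits.argmax(dim=1)         # (4, 128, 128) -> class per pixel
```

Each 128 x 128 image thus requires 16,384 pixel-level decisions instead of one, which is why dense annotation is so expensive to obtain.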
Read more →

Everything you need to know about stable diffusion

The goal of this article is to get you up to speed on stable diffusion. You will learn the main use cases, how stable diffusion works, debugging options, how to use it to your advantage, and how to extend it. I) Main use cases of stable diffusion: there are many ways to use stable diffusion, but here are the four main use cases.
[Figure: overview of the four main use cases for stable diffusion.]
Read more →

How and why stable diffusion works for text-to-image generation

Stable diffusion is all the rage in the deep learning community at the moment. It's trending on Twitter under #stablediffusion and gaining large amounts of attention all over the internet. We'll look into the reasons for all the attention and, more importantly, see how it works under the hood by considering the well-written paper "High-Resolution Image Synthesis with Latent Diffusion Models" by Rombach et al., which is the foundation of the system.
Read more →