Interactive visualization of stable diffusion image embeddings

A great site to discover images generated by stable diffusion (or Lexica's custom model, Aperture) is Lexica.art. Lexica provides an API that can be used to query images matching a keyword or topic. The API returns image URLs and sizes, along with metadata such as the prompt used to generate each image and its seed. The goal of this blog post is to visualize the similarity of images from different categories in an interactive plot that can be explored in the browser.
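To give a rough idea of what such a query might look like, here is a minimal sketch using the requests library. The endpoint URL and the JSON field names (src, width, height, prompt) are assumptions based on Lexica's public search API and may differ from what the post actually uses.

```python
# Hypothetical sketch: query the Lexica search API for images matching a keyword.
# The endpoint and the JSON field names are assumptions, not taken from the post.
import requests

def search_lexica(query: str) -> list[dict]:
    response = requests.get("https://lexica.art/api/v1/search", params={"q": query})
    response.raise_for_status()
    return response.json()["images"]

# Print URL, size, and prompt for the first few matches.
for image in search_lexica("castle")[:5]:
    print(image["src"], image["width"], image["height"], image["prompt"])
```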
Read more →

Semantic segmentation with prototype-based consistency regularization

Semantic segmentation is a complex task for deep neural networks, especially when limited training data is available. Unlike image classification problems such as ImageNet, semantic segmentation requires a class prediction for every individual pixel rather than a single image-level class. This demands a high level of detail and can be difficult to achieve with limited labeled data. Obtaining labeled data for semantic segmentation is challenging, as it requires precise pixel-level annotation, which is time-consuming for human annotators.
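To make the difference in output granularity concrete, here is a small PyTorch sketch; the layers are stand-ins, not the model from the paper, and the class count is illustrative. A classifier emits one score vector per image, while a segmentation network emits one per pixel.

```python
# Sketch contrasting output shapes: image-level classification vs. per-pixel
# segmentation. The 1x1 conv is a stand-in for a real segmentation network.
import torch
import torch.nn as nn

batch, num_classes, height, width = 2, 21, 224, 224
images = torch.randn(batch, 3, height, width)

# Image classification: one prediction per image -> (batch, num_classes).
classifier = nn.Linear(3 * height * width, num_classes)
cls_logits = classifier(images.flatten(1))
print(cls_logits.shape)  # torch.Size([2, 21])

# Semantic segmentation: one prediction per pixel -> (batch, num_classes, H, W).
seg_head = nn.Conv2d(3, num_classes, kernel_size=1)
seg_logits = seg_head(images)
print(seg_logits.shape)  # torch.Size([2, 21, 224, 224])
```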
Read more →

Everything you need to know about stable diffusion

The goal of this article is to get you up to speed on stable diffusion. You will learn the main use cases, how stable diffusion works, debugging options, how to use it to your advantage, and how to extend it. I) Main use cases of stable diffusion: there are many ways to use stable diffusion, but here are the four main use cases. (Figure: overview of the four main use cases for stable diffusion.)
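For a taste of the most common use case, text-to-image generation, here is a minimal sketch using Hugging Face's diffusers library; the pipeline call and model id follow standard StableDiffusionPipeline usage and are not taken from the article itself.

```python
# Minimal text-to-image sketch with diffusers; requires a CUDA GPU.
# The model id is the standard v1.5 checkpoint, assumed for illustration.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
)
pipe = pipe.to("cuda")

image = pipe("a photograph of an astronaut riding a horse").images[0]
image.save("astronaut.png")
```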
Read more →

How and why stable diffusion works for text to image generation

Stable diffusion is all the rage in the deep learning community at the moment. It’s trending on Twitter under #stablediffusion and attracting attention all over the internet. We’ll look at the reasons for all this attention and, more importantly, see how stable diffusion works under the hood by walking through the well-written paper “High-Resolution Image Synthesis with Latent Diffusion Models” by Rombach et al., which is the foundation of the system.
Read more →

Rethinking Depthwise Separable Convolutions in PyTorch

This is a follow-up to my previous post on depthwise separable convolutions in PyTorch. This article is based on the nice CVPR paper titled “Rethinking Depthwise Separable Convolutions: How Intra-Kernel Correlations Lead to Improved MobileNets” by Haase and Amthor. Previously, I looked at depthwise separable convolutions, which are a drop-in replacement for standard convolutions, with a focus on computational and parameter efficiency. Basically, you can achieve similar results with far fewer parameters and FLOPs, which is why they are used in MobileNet-style architectures.
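As a reminder of the building block in question, here is a minimal PyTorch sketch of a depthwise separable convolution (class name and channel sizes are illustrative, not taken from the post): a depthwise convolution with groups equal to the channel count, followed by a 1x1 pointwise convolution.

```python
# Depthwise separable convolution: per-channel spatial filtering (depthwise)
# followed by cross-channel mixing with a 1x1 convolution (pointwise).
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    def __init__(self, in_channels: int, out_channels: int, kernel_size: int = 3):
        super().__init__()
        # Depthwise: one filter per input channel (groups == in_channels).
        self.depthwise = nn.Conv2d(
            in_channels, in_channels, kernel_size,
            padding=kernel_size // 2, groups=in_channels,
        )
        # Pointwise: 1x1 convolution that mixes information across channels.
        self.pointwise = nn.Conv2d(in_channels, out_channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.pointwise(self.depthwise(x))

# Compare parameter counts against a standard convolution.
standard = nn.Conv2d(64, 128, kernel_size=3, padding=1)
separable = DepthwiseSeparableConv(64, 128)
count = lambda m: sum(p.numel() for p in m.parameters())
print(count(standard), count(separable))  # 73856 vs. 8960 parameters
```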
Read more →