Today’s paper: Emerging Properties in Self-Supervised Vision Transformers by Mathilde Caron et al. Let’s get the dinosaur out of the room: the name DINO stands for self-distillation with no labels. The self-distillation part refers to self-supervised learning in a student-teacher setup of the kind often used for distillation. The catch is that, in contrast to a normal distillation setup where a previously trained teacher network trains a student network, here they work without labels and without pre-training the teacher.
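To make the student-teacher idea a bit more concrete, here is a minimal NumPy sketch of one self-distillation step without labels. It is only an illustration under loud assumptions, not the authors' implementation: both networks are stand-in linear maps, the teacher starts as a copy of the student (so it is not pre-trained), it is updated as a moving average of the student, and the temperatures, momentum values, and the names `dino_loss` and `ema_update` are chosen for the example.

```python
import numpy as np

def softmax(logits, temperature):
    """Temperature-scaled softmax over the last axis (numerically stable)."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def dino_loss(student_logits, teacher_logits, center,
              student_temp=0.1, teacher_temp=0.04):
    """Cross-entropy between teacher targets and student predictions.

    The teacher output is centered and sharpened; it acts as a fixed
    target, so no labels are needed. Temperatures are illustrative.
    """
    targets = softmax(teacher_logits - center, teacher_temp)
    log_p = np.log(softmax(student_logits, student_temp) + 1e-12)
    return float(-(targets * log_p).sum(axis=-1).mean())

def ema_update(teacher_w, student_w, momentum=0.996):
    """Move the teacher weights a small step towards the student weights."""
    return momentum * teacher_w + (1.0 - momentum) * student_w

rng = np.random.default_rng(0)

# Assumption: a single linear layer stands in for each network.
student_w = rng.normal(size=(8, 4))
teacher_w = student_w.copy()      # teacher is NOT pre-trained
center = np.zeros(4)

# Two noisy "views" of the same unlabeled batch (stand-in for augmentations).
x = rng.normal(size=(16, 8))
view_a = x + 0.1 * rng.normal(size=x.shape)
view_b = x + 0.1 * rng.normal(size=x.shape)

student_logits = view_a @ student_w
teacher_logits = view_b @ teacher_w
loss = dino_loss(student_logits, teacher_logits, center)

# Gradient step on the student only; the teacher targets are held constant.
student_temp, teacher_temp, lr = 0.1, 0.04, 0.01
targets = softmax(teacher_logits - center, teacher_temp)
probs = softmax(student_logits, student_temp)
grad_w = view_a.T @ ((probs - targets) / student_temp) / len(x)
student_w = student_w - lr * grad_w

# The teacher follows the student through the moving average, not gradients.
teacher_w = ema_update(teacher_w, student_w)
center = 0.9 * center + 0.1 * teacher_logits.mean(axis=0)

print(f"loss for this step: {loss:.4f}")
```

The point of the sketch is the asymmetry: only the student receives gradients, while the teacher is built from the student itself, which is why no previously trained network is required.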
Today’s paper: End-to-End Object Detection with Transformers by Carion et al. This is the second paper in the new series Deep Learning Papers visualized, and it is about applying the transformer approach (the current state of the art in the domain of speech) to the domain of vision. More specifically, the paper is concerned with object detection; the paper by Carion et al. is available on arXiv.