I find that in the field of data science and machine learning some coding principles that are standard in traditional software engineering sometimes are lacking. One such principle is to strive to rather specify everything that is possible in code rather than as comments. Why does it make sense to do that? Comments often don’t age well. You write them in the context of the current code, but then over time as the code gets changed and readapted to other use cases, the context changes.
In many neural network architectures like MobileNets, depthwise separable convolutions are used instead of regular convolutions. They have been shown to yield similar performance while being much more efficient in terms of using much less parameters and less floating point operations (FLOPs). Today, we will take a look at the difference of depthwise separable convolutions to standard convolutions and will analyze where the efficiency comes from. Short recap: standard convolution In standard convolutions, we are analyzing an input map of height H and width W comprised of C channels.
Today’s paper: Pyramidal Convolution by Duta et al. This is the third paper of the new series Deep Learning Papers visualized and it’s about using convolutions in a pyramidal style to capture information of different magnifications from an image. The authors show how a pyramidal convolution can be constructed and apply it to several problems in the visual domain. What’s really interesting is that the number of parameters can be kept the same while performance tends to improve.
Every developer needs access to some servers for example to check the application logs. Usually, this is done using public-private key encryption where each developer generates their own public-private key pair. The public keys of each developer are added to the authorized_keys file on each server they should have access to. Painful manual changes So far so good. However, what happens when one developer leaves the company? In that case, the public keys of that developer should be removed from all servers.
Today’s paper: End-to-End object detection with transformers by Carion et al. This is the second paper of the new series Deep Learning Papers visualized and it’s about using a transformer approach (the current state of the art in the domain of speech) to the domain of vision. More specifically, the paper is concerned with object detection and here is the link to the paper of Carion et al. on arxiv.
Follow me on twitter! Follow @mpaepper