Refactoring machine learning code - namedtuple

Instead of relying on index-based access, which can be confusing, use a namedtuple. It’s backwards compatible, so you can still use the index, but named fields make your code much more readable. This is especially helpful when you convert between PIL and NumPy-based code, where PIL uses (column, row) notation while NumPy uses (row, column) notation. Let’s consider this piece of code where we want to get the pixel locations of several points which are in the NumPy format:
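The excerpt stops before the original code, so here is a minimal sketch of the idea rather than the post's own example. The names PixelLocation and to_pil_xy and the sample coordinates are made up for illustration.

```python
from collections import namedtuple

# Hypothetical type for illustration: a pixel location in NumPy (row, column) order.
PixelLocation = namedtuple("PixelLocation", ["row", "col"])


def to_pil_xy(location):
    """Convert a (row, col) location to PIL's (column, row) order."""
    return (location.col, location.row)


# Made-up sample points in NumPy order.
points = [PixelLocation(row=10, col=250), PixelLocation(row=42, col=7)]

# Named access states the axis order at the point of use.
pil_points = [to_pil_xy(p) for p in points]

# Index access and unpacking still work, so older tuple-based code keeps running.
first_row = points[0][0]
row, col = points[1]
```

Because a namedtuple is still a tuple, it can be introduced gradually: functions that index with [0] and [1] keep working while new code switches to .row and .col.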

Refactoring machine learning code - einops

Einops is a really great library for improving your machine learning code. It supports NumPy, PyTorch, TensorFlow and many more machine learning libraries. It gives your code more semantic meaning and can also save you a lot of headaches when transforming data. As a primer, let’s look at a typical use case in machine learning where you have a bunch of data and you want to reshape it so that some dimensions are merged together, like this:

Refactoring machine learning code - comments as code

I find that in data science and machine learning, some coding principles that are standard in traditional software engineering are sometimes lacking. One such principle is to express as much as possible in the code itself rather than in comments. Why does it make sense to do that? Comments often don’t age well. You write them in the context of the current code, but over time, as the code gets changed and adapted to other use cases, the context changes.
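As a small illustration of the principle (a sketch with made-up function and constant names, not code from the post), compare a comment that explains a magic number with code whose names carry the same information:

```python
# Before: the comment carries the meaning and can silently go stale
# if the divisor or the purpose of the function changes.
def scale_before(values):
    # divide by 255 to get values between 0 and 1
    return [v / 255 for v in values]


# After: the names carry the meaning, so there is no comment left to drift.
MAX_PIXEL_VALUE = 255


def normalize_to_unit_range(pixel_values):
    return [value / MAX_PIXEL_VALUE for value in pixel_values]
```

Both functions behave the same. In the second version, anyone changing the constant or the function body has to confront the names, so the explanation is more likely to be updated along with the code.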

Swift as a viable Python alternative?

Recently Swift for TensorFlow has picked up some steam, so I wanted to explore the Swift programming language a bit. Swift’s main advantage over Python is speed: it compiles to native code through the LLVM compiler infrastructure. Python relies heavily on C to make code run fast, and Python code that is not optimized can be very slow. However, Swift’s main disadvantage is that its ecosystem of machine learning and data processing libraries is currently a lot less powerful than Python’s.

Bash: Keep Script Running - Restart on Crash

When you are prototyping and developing small scripts that you keep running, it can be annoying that they quit when an error occurs. If you want very basic robustness against these crashes, you can use a bash script to automatically restart your script on error. The tool for this is bash’s until loop, which makes this a breeze. Let’s use a dumb example Python script called test.py: import time while True: print('Looping') time.