An important consideration in machine learning is the shape of your data and your variables. You are often shifting and transforming data and then combining it. Thus, it is essential to know how to do this and what shortcuts are available.

Let’s start with a tensor with a single dimension:

import torch
test = torch.tensor([1,2,3])

Now assume we have built a machine learning model that takes batches of such one-dimensional tensors as input and returns some output.

We want to test it with our example test tensor. To be able to do so, we need to make sure that it has two dimensions as the model expects a batch of such tensors. This can be done using unsqueeze:

test_batch = test.unsqueeze(0)
print(test_batch, test_batch.shape)
tensor([[1, 2, 3]]) torch.Size([1, 3])

Note the extra pair of [] brackets around our tensor, which now make it 2-dimensional. The argument to unsqueeze is the position at which the new dimension of size 1 is inserted. As we want to put the tensor in a batch, we insert it at the beginning (position 0).
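To make the role of that argument concrete, here is a small sketch that inserts the new dimension at different positions. The variable names as_row, as_column and as_column_neg are made up for this illustration.

```python
import torch

test = torch.tensor([1, 2, 3])  # shape: [3]

# Insert the new size-1 dimension at the front: one row holding three values.
as_row = test.unsqueeze(0)
print(as_row.shape)  # torch.Size([1, 3])

# Insert it at the end instead: three rows holding one value each.
as_column = test.unsqueeze(1)
print(as_column.shape)  # torch.Size([3, 1])

# Negative indices count from the end, so -1 also adds a trailing dimension.
as_column_neg = test.unsqueeze(-1)
print(as_column_neg.shape)  # torch.Size([3, 1])
```

For batching, position 0 is the one you want, since the batch dimension conventionally comes first.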

And here is a shortcut you can take: PyTorch lets you index with None at any position where a new dimension should be added. So the unsqueeze call above can be rewritten as the equivalent:

test_batch = test[None,:]
print(test_batch, test_batch.shape)
tensor([[1, 2, 3]]) torch.Size([1, 3])

The : selects all elements along that dimension. A trailing : can be omitted in PyTorch, so we could also write this as:

test_batch = test[None]
print(test_batch, test_batch.shape)
tensor([[1, 2, 3]]) torch.Size([1, 3])
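If you want to convince yourself that the three spellings really produce the same result, a quick check along these lines should do. The names via_unsqueeze, via_none_slice and via_none are chosen only for this sketch.

```python
import torch

test = torch.tensor([1, 2, 3])

via_unsqueeze = test.unsqueeze(0)
via_none_slice = test[None, :]
via_none = test[None]

# torch.equal is True only if both tensors have the same shape and values.
same = torch.equal(via_unsqueeze, via_none_slice) and torch.equal(via_unsqueeze, via_none)
print(same)  # True
```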

This None notation can be used to add arbitrarily many dimensions, e.g.:

torch.Size([1, 3, 1])
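The expression that produced this shape is not shown here. One indexing expression consistent with it, assuming the same test tensor as before, would be the following (the name expanded is just for illustration):

```python
import torch

test = torch.tensor([1, 2, 3])  # shape: [3]

# One None before the data and one after it: a new leading and a new trailing dimension.
expanded = test[None, :, None]
print(expanded.shape)  # torch.Size([1, 3, 1])

# An Ellipsis (...) stands for "all existing dimensions" and gives the same result.
expanded_ellipsis = test[None, ..., None]
print(expanded_ellipsis.shape)  # torch.Size([1, 3, 1])
```

Note that the : (or ...) cannot be dropped in this case, because it is no longer trailing: test[None, None] would put both new dimensions in front and give a shape of [1, 1, 3].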

The inverse operation of unsqueeze is called, unsurprisingly, squeeze. So if we have a batch of just one item, we can use squeeze to remove the batch dimension and get the item back. Let’s verify:

tensor([1, 2, 3])
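A minimal sketch of such a check, assuming test and test_batch are defined as earlier in this post (restored is a name introduced only here):

```python
import torch

test = torch.tensor([1, 2, 3])
test_batch = test.unsqueeze(0)  # shape: [1, 3]

# Remove the size-1 batch dimension at position 0 again.
restored = test_batch.squeeze(0)
print(restored)  # tensor([1, 2, 3])

# The round trip gives back a tensor equal to the original.
print(torch.equal(restored, test))  # True
```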

You can use squeeze without an input parameter. In that case, all dimensions which are of size 1 will be removed.
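The difference between the two forms can be sketched like this, using a made-up tensor named padded with two dimensions of size 1:

```python
import torch

padded = torch.zeros(1, 3, 1)  # size-1 dimensions at positions 0 and 2

# No argument: every dimension of size 1 is removed.
print(padded.squeeze().shape)  # torch.Size([3])

# With an argument: only that dimension is removed.
print(padded.squeeze(0).shape)  # torch.Size([3, 1])

# If the given dimension does not have size 1, the shape is left unchanged.
print(padded.squeeze(1).shape)  # torch.Size([1, 3, 1])
```

Passing the dimension explicitly is usually the safer choice in model code: an argument-free squeeze on a batch that happens to contain a single item would remove the batch dimension as well.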