Multidimensional Array Terminology

Categories: ml, coding

Author: Ken Arnold
Published: January 27, 2023

A surprisingly large amount of the thinking that goes into implementing neural net code is getting the shapes of multidimensional structures right. I’d heard that from others but didn’t really believe it until I had to figure it out myself a few times, which convinced me that everyone could use some guided practice. So I give my AI students exercises in thinking about the shapes of multidimensional structures. We work with images because they’re easier to visualize, but the same issue comes up in sequence modeling (batch by sequence length by embedding dimension, sometimes with an attention-head dim in there too!).
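
For concreteness, here’s a minimal sketch of those two shape conventions (the specific sizes are made up for illustration):

import torch

# A batch of 32 RGB images, 64×64 pixels, in the common
# (batch, channels, height, width) layout:
images = torch.randn(32, 3, 64, 64)

# A batch of 32 token sequences of length 128 with 512-dim embeddings,
# in the common (batch, sequence length, embedding dim) layout:
sequences = torch.randn(32, 128, 512)

images.shape, sequences.shape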

Students start to explore what broadcasting does (before officially learning how it works), which lets you do cool things like inverting an image by just computing 1 - image.
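
Here’s a minimal sketch of that trick, assuming a grayscale image stored as floats in [0, 1] (the random tensor is just a stand-in for a real image):

import torch

# Stand-in for a grayscale image with pixel values in [0, 1]
image = torch.rand(64, 64)

# Broadcasting stretches the scalar 1 to match image's shape,
# so every pixel value v becomes 1 - v: the image is inverted.
inverted = 1 - image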

Axis or Dimension?

Problem: both the PyTorch and NumPy broadcasting docs tend to use the term “dimension”. This is confusing because, e.g., [0.5, 0.25, 0.75] is a vector in 3d space but has just one axis.
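
A quick check makes the distinction concrete:

import torch

v = torch.tensor([0.5, 0.25, 0.75])
v.shape  # torch.Size([3]): a vector in 3d space...
v.ndim   # 1: ...but with only one axis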

I’m guilty of sloppy use of this terminology too, but I suggest we use “number of axes” to refer to len(some_array.shape). This aligns with the NumPy Glossary and the axis= keyword common in NumPy functions. Unfortunately, we’ll need to remember that PyTorch sometimes uses “dimension” instead, e.g., array.ndim and the dim= keyword argument (kwarg) to reduction methods like softmax.
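
A small illustration of the naming mismatch (the arrays here are arbitrary):

import numpy as np
import torch

a = np.ones((2, 3))
a.sum(axis=0)     # NumPy reductions take axis=

t = torch.ones((2, 3))
t.sum(dim=0)      # ...while PyTorch calls the same thing dim=
t.softmax(dim=1)  # e.g., softmax over the second axis
t.ndim            # 2: "number of dimensions", i.e., number of axes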

Rank?

To make matters worse, the fast.ai book uses “rank” to refer to the number of axes of a tensor. But “rank” means something different in linear algebra. For example, a length-5 column vector times a length-4 row vector gives a matrix (a tensor with two axes) with shape (5, 4):

import torch

t1 = torch.ones((5, 1))  # column vector: shape (5, 1)
t2 = torch.ones((1, 4))  # row vector: shape (1, 4)
t3 = t1 @ t2             # outer product
t3

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])

t3.shape

torch.Size([5, 4])

t3.ndim

2

But because of how we made it, we know that the matrix has rank 1 in the linear algebra sense: every column is a multiple of the same column vector t1. Torch confirms this. (Note that computing rank numerically is sensitive to floating-point precision issues, though it works correctly in this case.)

torch.linalg.matrix_rank(t3)
tensor(1)
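
That precision caveat is easy to demonstrate: a tiny perturbation can bump the numerical rank, which is why torch.linalg.matrix_rank accepts tolerance arguments (atol=/rtol= in recent PyTorch versions; the noise scale here is arbitrary):

t3_noisy = t3 + 1e-4 * torch.randn_like(t3)
torch.linalg.matrix_rank(t3_noisy)             # likely tensor(4) now
torch.linalg.matrix_rank(t3_noisy, atol=1e-2)  # tensor(1) again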

The Exercise

Here’s the exercise, for anyone interested:

Image Operations (open in Colab)