What is the best notation for consistency when talking about vectors and matrices?

:information_source: I am using NumPy notation when discussing dimensions here.

I am getting all turned around when switching between m and n to indicate dimensions of a vector or matrix.

For example, from the beginning of class, we talk about the “length of a training set” being represented by the variable m. I like this because it is consistent whether the training set is a 1-dimensional vector (m,) or a 2-dimensional matrix (m, n). But now and again I see an equation explaining a vector with this notation,

x = \sum_{i=0}^{n-1} a_i b_i

Note that a and b are vectors here, and the upper limit n-1 reflects a Python implementation with zero-based indexing: each vector has n elements, with shape (n,) or (n, 1).
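For context, here is a minimal NumPy sketch of that summation, assuming it is the dot product it appears to be (the values of a and b are made up for illustration):

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])  # shape (n,) with n = 3
b = np.array([4.0, 5.0, 6.0])  # same shape (n,)

n = a.shape[0]
# Literal translation of the formula: indices run 0 .. n-1
x = sum(a[i] * b[i] for i in range(n))

# Vectorized equivalent
assert x == np.dot(a, b)  # 1*4 + 2*5 + 3*6 = 32
```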

Am I missing something nuanced here?

  • I know dimensions in NumPy will look something like, (m, n, ...) where each additional variable is another dimension of the same array.
  • Sometimes vectors are represented with n instead of m in the context of the NumPy dimensions like, (n,) versus (m,).
  • Even in the context of a vector \vec{x}, we access elements by indexing, e.g., x[0] is the first element of \vec{x}, or in other terms, x^{(0)} is the first element of the same vector.
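To make the bullet points concrete, a small sketch (the array values are arbitrary):

```python
import numpy as np

X = np.arange(12).reshape(3, 4)  # a 2-D array of shape (m, n) = (3, 4)
x = X[0]                         # first row: a 1-D vector of shape (n,)

print(X.shape)  # (3, 4)
print(x.shape)  # (4,)
print(x[0])     # first element of the vector x, i.e. x[0] in the notation above
```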

Hi @jesse!

Keep in mind that whatever I say here is not a worldwide standard, as I am afraid there is no universal standard for this. The best we can hope for is that one website or one Python package strictly follows one standard of its own throughout its documentation.

Let me try to build my answer into a consistent story, starting from your (and our) (m, n) notation for the size of a matrix (a 2D array).

Then let’s turn to vectors. We may represent a vector as a 1D array or a 2D array. Let’s focus on the 2D case first, because a vector can be a row vector or a column vector, and a 2D array can accommodate both. If we follow our notation, a row vector has a shape of (1, n). Building from that row vector, a column vector, which is the transpose of that row vector, has a shape of (n, 1).
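For example (a sketch with arbitrary values):

```python
import numpy as np

row = np.array([[1.0, 2.0, 3.0]])  # row vector: shape (1, n) = (1, 3)
col = row.T                        # its transpose, a column vector: shape (n, 1) = (3, 1)

print(row.shape)  # (1, 3)
print(col.shape)  # (3, 1)
```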

I think your summation example is a dot product, which multiplies a row vector by a column vector; they should have the shapes (1, n) and (n, 1) respectively.
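In NumPy terms, that is a (1, n) @ (n, 1) product, which yields a (1, 1) result (the values here are just for illustration):

```python
import numpy as np

row = np.array([[1.0, 2.0, 3.0]])      # (1, n)
col = np.array([[4.0], [5.0], [6.0]])  # (n, 1)

result = row @ col                     # matrix product: shape (1, 1)
print(result)  # [[32.]]
```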

Finally, if we shrink our 2D row vector of shape (1, n) back to a 1D representation, then the shape of that 1D array will be (n,).
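One way to do that shrinking in code (a sketch; both reshape and squeeze work here):

```python
import numpy as np

row = np.ones((1, 4))   # 2-D row vector, shape (1, 4)
flat = row.reshape(-1)  # 1-D array, shape (4,)

assert flat.shape == (4,)
assert np.squeeze(row).shape == (4,)  # squeeze drops the size-1 axis
```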

In the context of a matrix, m and n are just the numbers of rows and columns. In the context of ML data, they are the number of samples and the number of features.

For a vector, n is the number of elements. For a vector in ML data that represents just one sample, the size of the vector is the number of features, which is again n.
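Putting those last two paragraphs together (a sketch with made-up sizes):

```python
import numpy as np

m, n = 5, 3           # m samples, n features
X = np.zeros((m, n))  # training set: a matrix of shape (m, n)
sample = X[0]         # one sample: a vector of shape (n,), i.e. n elements

assert X.shape == (m, n)
assert sample.shape == (n,)
```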

Would my answer connect everything up in an acceptable way?


Thanks @rmwkwok,

I guess I do still have a question or two.

If it is convention for us to describe a 2-D matrix with the dimensions (m, n) where m is the number of rows, isn’t the first dimension of this matrix described by m?

If that is the case, why do we not maintain that notation for a vector, either 1-D or 2-D with either (m,) or (m, 1) and (1, m) respectively since the variable m would represent the first or only dimension in a vector? I’m confused why this is n and not m. Is there a concrete reason for this or is this purely convention?

Hello @jesse, you may find what I am going to say tricky, but “that’s why the flow of my story is like that”. Again, I don’t think there is a standard for this, which is why I can only give you a story whose flow connects things up (I am being pretty plain here =P), instead of pointing you to an authoritative source of definitions, which I would be very happy to do if one existed.

At one extreme (I know it sounds blunt), if we have a 30-D array, we are going to run out of letters, and we might need to give up using m as the size of our first axis because single-letter notation won’t work anymore.

Therefore, unfortunately, no concrete reason: it is purely convention.


I shared a Wikipedia page above. That page combined with your generous feedback has really helped me solidify my thoughts.