Vectors are tensors of 1 dimension, but in the course Prof Ng repeatedly refers to vectors of “dimension” n_x (refer to the Week 2 videos). Could someone clarify this for me?
Prof Ng does not introduce the terminology of tensors until we get to Course 2 Week 3. Here we are dealing with scalars, vectors, matrices and arrays. In numpy, you can also define a vector as a one dimensional array, but Prof Ng chooses not to do that: he defines vectors as 2D arrays, so that they have an orientation: if a vector is n_x x 1, then it is a “column vector” and if it is 1 x m, then it is a “row vector”. We will be doing “dot product” style multiplication between vectors and matrices (2D objects), so it makes a lot more sense to define vectors as having 2 dimensions (one of which happens to have size 1).
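For example, here is a minimal numpy sketch of a matrix times a column vector (the shapes here are purely illustrative):

import numpy as np

W = np.random.randn(3, 2)   # a 3 x 2 matrix
x = np.random.randn(2, 1)   # a column vector with n_x = 2
print(np.dot(W, x).shape)   # (3, 1): the inner dimensions (2 and 2) agree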
Thank you so much. As a follow-up: I’ve learned matrix multiplication from the mathematical point of view, but in Week 2 of Course 1 Prof Ng does dot product multiplications of two vectors/matrices. This went slightly over my head, as it didn’t seem to follow the rule of matrix multiplication I’ve been taught, where you can multiply only if the number of columns of matrix 1 equals the number of rows of matrix 2. Also, the matrix/vector terminology has been used interchangeably, and it would be really helpful to know their hierarchy.
Everything Prof Ng shows does follow the standard rules of matrix multiplication that you quote. If you think you have a counterexample, it just means you need to look a bit more closely to make sure you understand what you are seeing. Note that he chooses the convention that any standalone predefined vector is oriented as a column vector. So both the weights w and each input sample vector x in the case of Logistic Regression in Week 2 are column vectors of dimensions n_x x 1, where n_x is the number of input features (elements in each input sample). So when he defines the linear activation as a purely mathematical expression, it is:
z = \displaystyle \sum_{i = 1}^{n_x} w_i x_i + b
Then when he expresses that as a vector computation it becomes:
z = w^T \cdot x + b
You can see why the transpose is required there: it results in the matrix multiply being 1 x n_x dot n_x x 1, which yields a 1 x 1 or scalar result.
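To make that concrete, here is a minimal numpy sketch of that computation (the value of n_x and the bias are illustrative):

import numpy as np

n_x = 4                      # number of input features (illustrative)
w = np.random.randn(n_x, 1)  # weights as a column vector
x = np.random.randn(n_x, 1)  # one input sample as a column vector
b = 0.5                      # scalar bias (illustrative)

z = np.dot(w.T, x) + b       # (1, n_x) dot (n_x, 1) -> (1, 1)
print(z.shape)               # (1, 1)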
The hierarchy is this containment relationship:
scalars \subset vectors \subset matrices \subset arrays
A matrix is an array with 2 dimensions, but you can have arrays with more than 2 dimensions like the input images in the Logistic Regression assignment: they are m x 64 x 64 x 3 arrays with 4 dimensions and then we “flatten” them into 2D matrices. Here’s a thread which talks about how the flattening works.
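As a rough sketch of the idea (the actual assignment code may differ in details), one common numpy idiom for that flattening is:

import numpy as np

m = 10                                   # number of images (illustrative)
images = np.random.randn(m, 64, 64, 3)   # 4D array: m RGB images of 64 x 64

# Flatten each image into a single column: reshape to (m, 64*64*3),
# then transpose so that each column holds one sample
X = images.reshape(m, -1).T
print(X.shape)                           # (12288, 10)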
One other important thing to mention is that Prof Ng also uses “elementwise” multiply operations in some cases. Those usually aren’t talked about much in Linear Algebra math classes, but when they are, they are called Hadamard Products. That is a completely different operation from the normal “dot product” style matrix multiply. Here’s a thread which discusses how to tell the difference. Prof Ng always uses “*” as the operator when he means “elementwise”.
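Here is a small numpy comparison of the two operations (the values are illustrative):

import numpy as np

A = np.array([[1., 2.], [3., 4.]])
B = np.array([[10., 20.], [30., 40.]])

print(A * B)         # elementwise (Hadamard): [[10. 40.], [90. 160.]]
print(np.dot(A, B))  # matrix multiply: [[70. 100.], [150. 220.]]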
Thank you for the quick and helpful response. The point you make in the second-to-last paragraph is still not clear to me. A vector is an array with 1 dimension, but then what is n_x? I also feel I’m confusing this with the geometrical notion of a vector in some way; or is that related at all to what we deal with here?
I am well familiar with the math side of matrices, but it’s terminology like this that throws me off slightly. So am I ready for this course, and if not, where can I learn about vectors and numpy from a data science point of view?
You’re right that from a mathematical point of view, a vector is a 1 dimensional object. But in numpy (or tensorflow) you have two choices: you can represent a vector as a 1 dimensional object, or as a 2 dimensional object in which one dimension has length 1. In other words, if you consider a matrix as a 2D object in numpy, you can represent a vector as a matrix which happens to have one dimension of size 1. So a “column vector” has dimensions n x 1 and a “row vector” has dimensions 1 x n. Prof Ng chooses the 2D representation of vectors because it then makes more sense when you do dot product style multiplies between matrices and vectors.
Here are some examples in numpy:
import numpy as np

np.random.seed(42)

# A 1D array: its shape has only one entry
v = np.random.randn(5)
print(f"type(v) = {type(v)}")
print(f"v.dtype = {v.dtype}")
print(f"v.shape = {v.shape}")
print(f"v = {v}")

# A 2D "column vector": 5 rows and 1 column
w = np.random.randn(5,1)
print(f"w.shape = {w.shape}")
print(f"w = {w}")

# A 2D "row vector": 1 row and 5 columns
w = np.random.randn(1,5)
print(f"w.shape = {w.shape}")
print(f"w = {w}")
Running the above code gives this result:
type(v) = <class 'numpy.ndarray'>
v.dtype = float64
v.shape = (5,)
v = [ 0.49671415 -0.1382643 0.64768854 1.52302986 -0.23415337]
w.shape = (5, 1)
w = [[-0.23413696]
[ 1.57921282]
[ 0.76743473]
[-0.46947439]
[ 0.54256004]]
w.shape = (1, 5)
w = [[-0.46341769 -0.46572975 0.24196227 -1.91328024 -1.72491783]]
Notice how the column and row vectors display differently from the 1D vector. Also notice that there are two layers of brackets, indicating that those objects have 2 dimensions.
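One related gotcha: transposing a 1D numpy array is a no-op, which is another reason the 2D convention is safer when a vector needs an orientation:

import numpy as np

v = np.random.randn(5)     # 1D array, shape (5,)
w = np.random.randn(1, 5)  # 2D row vector, shape (1, 5)

print(v.T.shape)  # (5,): transposing a 1D array changes nothing
print(w.T.shape)  # (5, 1): the row vector becomes a column vector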