Dimensions of Weight matrix - change from Week 2 to Week 3

Yes, Professor Ng is in complete control here and he gets to decide how to represent everything. He uses the convention that any standalone vector is a column vector, which is why the weight vector w in Logistic Regression is an n_x x 1 column vector. That then requires the transpose for the dot product with X to work as intended (z = w^T X + b).
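Here's a minimal NumPy sketch of that Week 2 convention (the sizes are just illustrative, not from the assignment):

```python
import numpy as np

# Week 2 convention (illustrative sizes): w is a standalone column vector,
# so the transpose is needed for the dot product with X.
n_x, m = 4, 3                    # n_x features, m training examples
w = np.zeros((n_x, 1))           # weight vector: (n_x, 1) column vector
b = 0.0
X = np.random.randn(n_x, m)      # examples stacked as columns: (n_x, m)

z = np.dot(w.T, X) + b           # (1, n_x) @ (n_x, m) -> (1, m)
print(z.shape)                   # (1, m)
```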

But when we get to full neural networks, the weights at each layer become a 2D matrix, and he chose to orient them as n^[l] x n^[l-1] so that the transpose is no longer required. He goes into detail about how everything is defined in the lectures in Week 3. Here's a thread that discusses this in a bit more detail, with a sketch of the shapes below.
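A minimal sketch of that Week 3 orientation (again, the layer sizes here are just made up for illustration):

```python
import numpy as np

# Week 3 convention (illustrative sizes): W for layer l has shape
# (n_l, n_prev), so the forward pass needs no transpose.
n_prev, n_l, m = 4, 5, 3                  # previous layer units, this layer units, batch size
W = np.random.randn(n_l, n_prev) * 0.01   # weight matrix: (n_l, n_prev)
b = np.zeros((n_l, 1))                    # bias: broadcast across the m examples
A_prev = np.random.randn(n_prev, m)       # activations from previous layer: (n_prev, m)

Z = np.dot(W, A_prev) + b                 # (n_l, n_prev) @ (n_prev, m) -> (n_l, m)
print(Z.shape)                            # (n_l, m)
```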
