Logistic_Regression_with_a_Neural_Network_mindset What is T

Feeling rather dumb here. I've been stuck on two questions for the last two days.

[image]
What is T meant to be here? Does it mean Transpose?

  1. Parameter description reads
    X – data of size (num_px * num_px * 3, number of examples)

I just can't picture this in my mind. Shouldn't this be (number of examples, num_px * num_px * 3)? Or is this why we need to do a transpose?

Yes, the T used as an exponent, e.g. w^T, means transpose. As for the dimensions of X, they could have been done either way, but Prof Ng gets to choose, and he chose "features x samples" as the orientation of X. So if the number of features is n_x and the number of samples is m, then the dimensions of X will be n_x x m.

The dimension of the weight vector w is n_x x 1. That is also a choice that Prof Ng has made: he uses the convention that any “standalone” vector is oriented as a column vector.

So with all that information, you can now see why the formula for Z is:

Z = w^T \cdot X + b

The operation between w^T and X is a matrix multiply, so the “inner dimensions” must agree and you can see that they do:

(1 x n_x) dot (n_x x m) gives us a result that is 1 x m.

Then when we apply the sigmoid, that is done “elementwise” so that A has the same shape as Z.
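To make the shapes concrete, here is a minimal NumPy sketch of that forward step, using small made-up dimensions (n_x = 4 features, m = 3 examples) rather than the actual image sizes from the assignment:

```python
import numpy as np

def sigmoid(z):
    # applied elementwise, so the output has the same shape as z
    return 1 / (1 + np.exp(-z))

n_x, m = 4, 3                   # toy sizes: features, examples

X = np.random.randn(n_x, m)     # "features x samples" orientation
w = np.zeros((n_x, 1))          # standalone vector as a column
b = 0.0

Z = np.dot(w.T, X) + b          # (1, n_x) dot (n_x, m) -> (1, m)
A = sigmoid(Z)                  # same shape as Z

print(Z.shape, A.shape)         # (1, 3) (1, 3)
```

If you instead try `np.dot(w, X)` you get a shape error, because the inner dimensions (1 and n_x) no longer agree; that is exactly why the transpose appears in the formula.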


Thank you for the detailed explanation.

“That is also a choice that Prof Ng has made: he uses the convention that any “standalone” vector is oriented as a column vector.”

I suppose this will be clear later. However I am unblocked as of now. Thanks a lot.


I’m just telling you that w is a column vector. It didn’t have to be that way, but that is how Prof Ng chooses to define it. Wait until next week, when the weights become a matrix: there the transpose will no longer be required, because of the way Prof Ng defines the matrices.
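As a preview of that point, here is a sketch of the matrix case. The specific sizes (n_h hidden units) are my own toy choices, but the layout follows the convention described above, where each row of W holds the weights for one unit, so no transpose is needed:

```python
import numpy as np

n_x, n_h, m = 4, 5, 3               # toy sizes: features, hidden units, examples

X = np.random.randn(n_x, m)         # still "features x samples"
W1 = np.random.randn(n_h, n_x)      # one row per hidden unit
b1 = np.zeros((n_h, 1))             # one bias per hidden unit, broadcast across examples

Z1 = np.dot(W1, X) + b1             # (n_h, n_x) dot (n_x, m) -> (n_h, m), no transpose

print(Z1.shape)                     # (5, 3)
```

Compare with the logistic regression case: there w is (n_x, 1), so w^T is needed to make the inner dimensions line up; here W1 is already (n_h, n_x), so they line up directly.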