For this quiz question, I am confused about why w is a column vector, because I remember Professor Ng saying that w is a row vector in his notes. Could anyone explain this to me? Thank you!

Hello @yshgooole ,

You can go back to the 3rd video of the week to see the neural network representation.

Hello @Phuc_Kien_Bui ,

Thank you so much for replying to my question. I now see the representation of the vector w, and I understand that if w is a column vector, then w.T must be a row vector. But is there any particular reason why w has to be a column vector, or is it just a convention to remember?

It is just a convention that Prof Ng has chosen to use in these courses: when he defines standalone vectors, he prefers to format them as column vectors. Likewise with the input sample vectors, which is why you then need to transpose w in order for the dot product w^T \cdot x to work.
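To make the shapes concrete, here is a minimal numpy sketch (the values are made up) of why the transpose is needed when both w and x are column vectors:

```python
import numpy as np

# Hypothetical values: one input sample with 3 features.
w = np.array([[0.5], [-1.2], [0.3]])  # weights as a (3, 1) column vector
x = np.array([[2.0], [1.0], [4.0]])   # input sample as a (3, 1) column vector

# (3, 1) @ (3, 1) is not a valid matrix product, so w @ x fails.
# Transposing w gives a (1, 3) row vector, and (1, 3) @ (3, 1) is (1, 1).
z = np.dot(w.T, x)  # the scalar w^T x, stored as a (1, 1) array
print(z)            # [[1.]]
```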

It is really a good question. One reason is convention, as @paulinpaloalto mentioned. Another has to do with the math. If we have a small NN like the one above, you can assign parameters to W and write down all the calculations with and without vectorization to make sense of it; see the sketch below.
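For instance, here is a rough numpy sketch (layer sizes and values are hypothetical) that computes one hidden layer's pre-activations both ways and checks that they agree:

```python
import numpy as np

np.random.seed(0)
n_x, n_h = 3, 4                  # hypothetical: 3 inputs, 4 hidden neurons
x = np.random.randn(n_x, 1)      # one input sample as a column vector
ws = [np.random.randn(n_x, 1) for _ in range(n_h)]  # one column vector w per neuron
bs = [np.random.randn() for _ in range(n_h)]

# Without vectorization: each neuron's z_i = w_i^T x + b_i, one at a time.
z_loop = np.array([[(w.T @ x).item() + b] for w, b in zip(ws, bs)])

# With vectorization: stack the transposed w vectors as the rows of W,
# so one matrix multiply computes every neuron at once.
W = np.vstack([w.T for w in ws])  # shape (n_h, n_x)
b = np.array(bs).reshape(-1, 1)   # shape (n_h, 1)
z_vec = W @ x + b

print(np.allclose(z_loop, z_vec))  # True
```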

Hello, @Phuc_Kien_Bui,

I am watching the third lecture of Week 4, and I am starting to feel confused again: why is there no transpose sign on w here? Or is the transpose hidden?

Once we get to real neural networks, W is a matrix, not a vector. Prof Ng explained all this in Week 3. He takes the individual weight vectors, transposes them, and then stacks them as the rows of W, so that the matrix multiply works without transposing the full matrix. The rows are already the transposed w vectors, and each row holds the unique weights for one neuron in the layer.

It is in this Week 3 lecture, starting at a time offset of about 4:00. If you missed that point the first time through, perhaps it’s worth watching again. There was also no transpose in forward propagation in the Planar Data exercise (the Week 3 assignment), right? The only difference here in Week 4 is that we’ve gone to full generality with any number of hidden layers, but what happens at an individual layer is the same as it was in Week 3.
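If it helps, here is a minimal sketch of that generality, with made-up layer sizes and random parameters: forward propagation just repeats Z[l] = W[l] A[l-1] + b[l] at each layer, and no transpose appears because each W[l] already has the transposed weight vectors as its rows.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

np.random.seed(1)
layer_sizes = [3, 4, 4, 1]  # hypothetical: 3 inputs, two hidden layers, 1 output
X = np.random.randn(3, 5)   # 5 samples stacked as columns, per the course convention

A = X
for l in range(1, len(layer_sizes)):
    # W has shape (n[l], n[l-1]): each row is already a transposed weight vector.
    W = np.random.randn(layer_sizes[l], layer_sizes[l - 1]) * 0.01
    b = np.zeros((layer_sizes[l], 1))
    Z = W @ A + b           # Z[l] = W[l] A[l-1] + b[l], no transpose needed
    A = sigmoid(Z)

print(A.shape)              # (1, 5): one output activation per sample
```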

That is a good explanation! Thank you!

I also had a hard time getting through it. Be patient!