For this quiz question, I am confused about why w is a column vector, because I remember Professor Ng saying that w is a row vector in his notes. Could anyone explain this to me? Thank you!
Hello @yshgooole ,
You can go back to the 3rd video of the week to see the neural network representation.
Hello @Phuc_Kien_Bui ,
Thank you so much for replying to my question. I now see the representation of the vector w, and I understand that if w is a column vector, then w.T must be a row vector. But is there any particular reason why w has to be a column vector, or is it just a convention to remember?
It is just a convention that Prof Ng has chosen to use in these courses: when he defines standalone vectors, he prefers to format them as column vectors. Likewise with the input sample vectors, which is why you then need to transpose w in order for the dot product w^T \cdot x to work.
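For example, here is a minimal NumPy sketch (the sizes are just made up for illustration, not from the assignment) of why the transpose is needed when both w and x are stored as column vectors:

```python
import numpy as np

# Made-up sizes for illustration: 4 input features, one example.
n_x = 4
w = np.random.randn(n_x, 1)   # weight vector stored as a column vector, shape (n_x, 1)
x = np.random.randn(n_x, 1)   # one input sample, also a column vector, shape (n_x, 1)
b = 0.5

# (1, n_x) @ (n_x, 1) -> (1, 1): transposing w makes the inner dimensions match.
z = np.dot(w.T, x) + b
print(z.shape)   # (1, 1)
```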
Hello, @paulinpaloalto,
Thanks for answering my question, and I fully understand this now.
It is really a good question. One reason is convention, as @paulinpaloalto mentioned. Another has to do with the math. If we have a small NN like the one above, you can assign values to the parameters in W and write out all the calculations with and without vectorization to make sense of it.
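For instance, here is one way to check it in NumPy (a tiny made-up layer, not the one from the assignment): compute each neuron's z with its own weight column vector, then compare with the vectorized version where the transposed weight vectors are stacked as the rows of W:

```python
import numpy as np

np.random.seed(0)

# A made-up tiny layer: 3 inputs, 2 neurons.
n_x, n_h = 3, 2
x = np.random.randn(n_x, 1)                               # one sample, column vector
w_list = [np.random.randn(n_x, 1) for _ in range(n_h)]    # one weight column vector per neuron
b = np.random.randn(n_h, 1)

# Without vectorization: one dot product per neuron.
z_loop = np.zeros((n_h, 1))
for i in range(n_h):
    z_loop[i, 0] = np.dot(w_list[i].T, x)[0, 0] + b[i, 0]

# With vectorization: stack the transposed weight vectors as the rows of W.
W = np.vstack([w.T for w in w_list])                      # shape (n_h, n_x)
z_vec = np.dot(W, x) + b

print(np.allclose(z_loop, z_vec))   # True
```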
Hello, @Phuc_Kien_Bui,
I am watching the third lecture of Week 4, and I am starting to feel confused again: why is there no transpose sign on w? Or is the transpose sign hidden?
Once we get to real neural networks, the W is a matrix, not a vector. Prof Ng explained all this in Week 3. He takes the individual weight vectors, transposes them, and then stacks them as the rows of W so that the matrix multiply works without doing a transpose of the full matrix. The rows are already the transposed w vectors, which give the unique weights for each neuron in the layer.
It is in this Week 3 lecture starting at time offset about 4:00. If you missed that point the first time through, perhaps it’s worth watching again. There also was no transpose in forward propagation in the Planar Data exercise (the Week 3 assignment), right? The only difference here in Week 4 is that we’ve gone to full generality with any number of hidden layers, but what happens at an individual layer is the same as it was in Week 3.
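For reference, here is a minimal sketch of that Week 3 style forward pass (the layer sizes and random parameters are just illustrative, not the assignment's values), showing that no transpose appears anywhere because the rows of each W are already the transposed per-neuron weight vectors:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

np.random.seed(1)

# Illustrative sizes: 2 inputs, 4 hidden units, 1 output unit,
# and 5 training examples stacked as the columns of X.
n_x, n_h, n_y, m = 2, 4, 1, 5
X = np.random.randn(n_x, m)

# Each W already has the transposed per-neuron weight vectors as its rows,
# so its shape is (units in this layer, units in the previous layer).
W1 = np.random.randn(n_h, n_x) * 0.01
b1 = np.zeros((n_h, 1))
W2 = np.random.randn(n_y, n_h) * 0.01
b2 = np.zeros((n_y, 1))

# Forward propagation: no transpose anywhere, as in the lectures.
Z1 = np.dot(W1, X) + b1      # (n_h, m)
A1 = np.tanh(Z1)
Z2 = np.dot(W2, A1) + b2     # (n_y, m)
A2 = sigmoid(Z2)
print(A2.shape)              # (1, 5)
```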
Thank you so much, @paulinpaloalto. It seems that I need to rewatch the Week 3 video.
It is a good explanation! Thank you!
I also had a hard time getting through it. Be patient!