In week 3, lectures 2 and 3, matrix W is 4x3 and x is 3x1 (actually the exact shape does not matter, as long as it is a vector of size 3). Then W^T @ x, where W^T is the transpose of W and @ represents matrix multiplication, does not make sense. There is no way to get the sizes to line up for matrix multiplication. I see this problem in several places. What am I missing?
Hey @Abhijat.Vatsyayan,
Welcome to the community. The point that you are missing is that it is W^T that has the dimensions (4, 3), not W. If you take a look at the slide below, you will find that in the representation of W^{[1]}, the weight vectors have been inserted after being transposed. So, essentially, W has the dimensions (3, 4); we take its transpose W^T, which has the dimensions (4, 3), and now we can do the matrix multiplication. Let me know if this helps.
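If it helps, here is a minimal NumPy sketch of this shape bookkeeping (the variable names and random values are just placeholders; the sizes follow the 4-unit, 3-input example from the lecture):

```python
import numpy as np

# Each column of W is one hidden unit's weight vector w_i,
# so with 3 inputs and 4 hidden units, W has shape (3, 4).
W = np.random.randn(3, 4)
x = np.random.randn(3, 1)  # input column vector, shape (3, 1)

# W @ x would fail: the inner dimensions (4 and 3) do not match.
# W.T has shape (4, 3), so W.T @ x lines up: (4, 3) @ (3, 1).
z = W.T @ x
print(z.shape)  # (4, 1): one pre-activation per hidden unit
```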
Cheers,
Elemento
Video 3 is a continuation of video 2, hence the confusion. See the following screenshot, where W is 4x3, not W transpose. I can make the dimensions work; it's pretty trivial. But it looked inconsistent, to say the least, which is why I asked the question. I will go over the videos again just to be sure.
Hey @Abhijat.Vatsyayan,
I guess when Prof mentions W^{[1]} in any of the slides, he is basically referring to W^T and not really W, as can be seen in the slides that you and I attached. This may indeed be confusing, since in a few places there is some slight abuse of notation, but you will find the notation to be consistent throughout the programming assignments, and pretty much throughout the lecture videos as well. I hope this helps.
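To make the "rows are transposed vectors" idea concrete, here is a small sketch of how W^{[1]} is assembled (random placeholder weights; the sizes again follow the 4-unit, 3-input example):

```python
import numpy as np

# Per-unit weight vectors, each a column of shape (3, 1).
w1, w2, w3, w4 = (np.random.randn(3, 1) for _ in range(4))

# W^{[1]} stacks the transposed vectors w_i.T as rows, giving
# shape (4, 3), so no further transpose is needed afterwards.
W1 = np.vstack([w1.T, w2.T, w3.T, w4.T])
b1 = np.random.randn(4, 1)
x = np.random.randn(3, 1)

z1 = W1 @ x + b1  # (4, 3) @ (3, 1) + (4, 1) -> (4, 1)

# Row i of z1 reproduces the single-unit formula w_i.T @ x + b_i.
assert np.isclose(z1[0, 0], (w1.T @ x)[0, 0] + b1[0, 0])
```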
Regards,
Elemento
It'd be a good idea to add an error note to the transcript of the video, as this can be really confusing to students.
Hey @Rodolfo_Novarini,
In the transcript for the video entitled "Computing a Neural Network's Output", you can find that whenever Prof Andrew mentions the w's as individual vectors, he says "w transpose". Let me know if, at any time-stamp, he has said something else that I might have missed, and I will raise the issue accordingly. Thanks for your contributions.
Cheers,
Elemento
Hi @Elemento, I do not believe he has mentioned it differently at any point. I found it surprising that he calls it a matrix of transposed vectors instead of a transposed matrix, but as I kept moving through the course I understood how this approach makes things simpler going forward. Thanks for your reply.