W^{[1]} will have shape (2, 4)

b^{[2]} will have shape (1, 1)

b^{[2]} will have shape (4, 1)

W^{[2]} will have shape (1, 4)

b^{[1]} will have shape (4, 1)

b^{[1]} will have shape (2, 1)

W^{[1]} will have shape (4, 2)

W^{[2]} will have shape (4, 1)

Please refer to the question for more clarity.

Please explain which formula is used, with all the necessary details.

Thank you.

Have a look at the lecture where Prof Andrew Ng explains dimension analysis.
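The general rule from that dimension-analysis lecture is that W^{[l]} has shape (n^{[l]}, n^{[l-1]}) and b^{[l]} has shape (n^{[l]}, 1). A minimal NumPy sketch checking this, assuming the layer sizes implied by the options above (n_x = 2 inputs, 4 hidden units, 1 output unit):

```python
import numpy as np

# Assumed layer sizes, inferred from the shapes in the quiz options:
n_x, n1, n2 = 2, 4, 1

# Rule: W^{[l]} has shape (n^{[l]}, n^{[l-1]}); b^{[l]} has shape (n^{[l]}, 1).
W1 = np.random.randn(n1, n_x)   # (4, 2)
b1 = np.zeros((n1, 1))          # (4, 1)
W2 = np.random.randn(n2, n1)    # (1, 4)
b2 = np.zeros((n2, 1))          # (1, 1)

# A forward pass on a batch X of shape (n_x, m) confirms the shapes compose:
m = 5
X = np.random.randn(n_x, m)
Z1 = W1 @ X + b1                # (4, m): (4, 2) @ (2, m), b1 broadcasts
Z2 = W2 @ np.tanh(Z1) + b2      # (1, m): (1, 4) @ (4, m), b2 broadcasts
print(W1.shape, b1.shape, W2.shape, b2.shape, Z2.shape)
```

Any other choice of shapes makes one of the matrix products fail, which is a quick way to check your answers.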


But according to the formulation, should it be W.T * X? That is, should the "W" here actually be transposed, i.e. W.T? cc: @jonaslalin

It depends on how you define W. Using Prof Andrew Ng’s definition, it should not be transposed.

I see. Thank you for the answer! In the lectures, W.T is used for logistic regression, while W is used for neural networks. Is there any reason not to keep the two definitions consistent?

Because the lowercase w (not W) is a column vector, we need a transpose to compute the summation, so that we end up with a row vector. In W, the rows already do what they are supposed to do.
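A short sketch of that contrast (the sizes n_x = 3, m = 5, and n1 = 4 are arbitrary assumptions for illustration): in logistic regression, w is one column vector, so computing the weighted sums over a batch requires w.T; in a network layer, each row of W already plays the role of one unit's w.T, so no transpose is needed.

```python
import numpy as np

n_x, m = 3, 5                    # assumed input size and batch size
X = np.random.randn(n_x, m)

# Logistic regression: w is a single (n_x, 1) COLUMN vector,
# so w.T @ X is needed to produce a (1, m) row of scores.
w = np.random.randn(n_x, 1)
z_lr = w.T @ X                   # shape (1, m)

# Neural-network layer: W^{[1]} stacks the units' weight vectors as ROWS,
# giving shape (n^{[1]}, n_x), so W1 @ X works with no transpose.
n1 = 4
W1 = np.stack([np.random.randn(n_x) for _ in range(n1)])  # (4, 3)
z_nn = W1 @ X                    # shape (4, m)
print(z_lr.shape, z_nn.shape)
```

So the two notations compute the same kind of dot products; W is just the row-stacked collection of the per-unit w.T vectors.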
