In logistic regression, the professor wrote dw = (1/m) X dZ.T. In week 3, he writes dZ A.T instead. I'm confused: I thought the order of the operands had to stay the same, since we are dealing with matrices.
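This comes from how the gradient with respect to W[2] is derived.
Notice that in logistic regression, dw = (1/m) X dZ.T. The transpose is on dZ so that the product has the same shape as w: X is (n_x, m) and dZ is (1, m), so X dZ.T is (n_x, 1).
Why is the transpose on A[1] in week 3? Because in the logistic regression videos w is a column vector, but W[2] is a row vector, so the operands are swapped and A[1] is the one that gets transposed: dZ[2] is (1, m) and A[1] is (n[1], m), so dZ[2] A[1].T is (1, n[1]).
In both cases the order and the transpose are chosen so that the gradient has the same shape as the parameter it updates.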
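As a quick check of the shape argument, here is a minimal NumPy sketch. The sizes n_x, n_1, m and the random arrays are made up purely for illustration; only the shapes matter, and the variable names are my own rather than the course's notebook code.

```python
import numpy as np

rng = np.random.default_rng(0)
n_x, n_1, m = 3, 4, 5  # hypothetical sizes: input features, hidden units, examples

# Logistic regression: w is a column vector of shape (n_x, 1)
X = rng.standard_normal((n_x, m))   # inputs, one example per column
dZ = rng.standard_normal((1, m))    # stand-in for the real dZ
dw = (X @ dZ.T) / m                 # (n_x, m) @ (m, 1) -> (n_x, 1)

# Two-layer network: W2 is a row vector of shape (1, n_1)
A1 = rng.standard_normal((n_1, m))  # hidden-layer activations
dZ2 = rng.standard_normal((1, m))   # stand-in for the real dZ[2]
dW2 = (dZ2 @ A1.T) / m              # (1, m) @ (m, n_1) -> (1, n_1)

# The two formulas are the same computation up to a transpose:
# (X dZ.T).T == dZ X.T
dw_as_row = (dZ @ X.T) / m          # (1, n_x)

print(dw.shape, dW2.shape, np.allclose(dw.T, dw_as_row))
```

So the operand order is not arbitrary: swapping the order and moving the transpose to the other factor gives the transposed result, which is what you want when the weights are stored as rows instead of a column.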