In logistic regression, the professor wrote $dw = \frac{1}{m} X \, dz^T$. In week 3, he writes $dz \, A^T$ instead. I'm confused: I thought we had to keep the order of the operands, since we are dealing with matrices.


This comes from how we derive the gradient with respect to $W^{[2]}$.

Notice that in logistic regression, $dw = \frac{1}{m} X \, dZ^T$. We use the transpose so that the dimensions of $X$ and $dZ$ line up and the result has the same shape as $w$.

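Why does $A^{[1]}$ have a $T$ after it? Because in the logistic regression video, $w$ is a **column vector**, but $W^{[2]}$ is a **row vector**, so we need to transpose $A^{[1]}$ and put it on the right-hand side of the product.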

In both cases, the order of the operands and the transpose are chosen so that the gradient has the same shape as the parameter it updates.
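
Here is a quick NumPy shape check of the two formulas, as a rough sketch rather than the course's own code. The sizes `n_x`, `n_h`, `m` and the random arrays are made up for illustration; the shapes assume the usual course layout, with examples stacked as columns.

```python
import numpy as np

rng = np.random.default_rng(0)
n_x, n_h, m = 3, 4, 5  # hypothetical sizes: inputs, hidden units, examples

# Logistic regression: w is a column vector of shape (n_x, 1).
X = rng.standard_normal((n_x, m))   # inputs, one example per column
dZ = rng.standard_normal((1, m))    # stand-in for the real dZ
dw = (1 / m) * X @ dZ.T             # (n_x, m) @ (m, 1) -> (n_x, 1)

# Output layer of the 2-layer network: W2 is a row vector of shape (1, n_h).
A1 = rng.standard_normal((n_h, m))  # hidden activations, one example per column
dZ2 = rng.standard_normal((1, m))   # stand-in for the real dZ2
dW2 = (1 / m) * dZ2 @ A1.T          # (1, m) @ (m, n_h) -> (1, n_h)

# Same rule, just transposed: (dZ @ X.T).T equals X @ dZ.T.
dw_alt = ((1 / m) * dZ @ X.T).T
same_rule = np.allclose(dw, dw_alt)

print(dw.shape, dW2.shape, same_rule)  # (3, 1) (1, 4) True
```

So the operand order is not arbitrary: swapping it is the same as transposing the result, which is what you need when the parameter is stored as a row instead of a column.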
