W2 A2 | Possible inaccuracy in the logistic regression gradient descent implementation

Hi! I just want to point out a possible inaccuracy in the explanation of the implementation of the logistic regression gradient descent algorithm. The step I’ve highlighted in red in the image to calculate the vector dw should be a matrix multiplication between the matrix X (n,m) and the column vector dZ.T (m,1). Hence, to be consistent with the notation used in the video, it should be replaced by
dw = 1/m * np.dot(X, dZ.T).
Otherwise, when implementing it in Python with the elementwise multiplication X * dZ, the row vector dZ (1,m) would be broadcast across the rows of X, resulting in an (n,m) matrix instead of the (n,1) vector dw.

I think the notation can create problems, in particular when moving to the implementation in Python, where the results of np.dot() and * are very different.
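
A minimal NumPy sketch of that difference, assuming the shapes used in the course (X is (n,m), dZ is (1,m)). The sizes n = 3, m = 5 and the random values are made up purely for illustration:

```python
import numpy as np

n, m = 3, 5                       # hypothetical sizes: n features, m examples
rng = np.random.default_rng(0)
X = rng.standard_normal((n, m))   # inputs, one column per example
dZ = rng.standard_normal((1, m))  # row vector of per-example errors

dw_dot = np.dot(X, dZ.T) / m      # matrix product: (n, m) x (m, 1) -> (n, 1)
elementwise = X * dZ              # broadcasting: dZ is reused for every row -> (n, m)

print(dw_dot.shape)       # (3, 1)
print(elementwise.shape)  # (3, 5)

# With these sizes (n != m), X * dZ.T cannot be broadcast at all.
try:
    X * dZ.T
except ValueError as err:
    print("X * dZ.T fails:", err)

# Summing the elementwise product over the examples recovers the dot product.
assert np.allclose(elementwise.sum(axis=1, keepdims=True) / m, dw_dot)
```

So the two operators only agree after an explicit sum over the examples, which is exactly what the matrix product does in one step.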

Thanks a lot in advance!

Hi @Davide_Cividino,

Thank you for highlighting this. The product there is a matrix multiplication, i.e. the dot product operation. dw = (1/m) X dZ^T is written as a mathematical expression, not as Python code.

Hi Davide!!
Welcome to our community.
Thanks for sharing your comments with us.

Thanks for the quick reply! Yes, I just think it is a bit misleading with respect to a couple of lines above, where the pseudocode notation ( np.dot() ) is used instead. If I use the slide as a pseudocode reference to implement the code, I could get tricked by the two different notations in the same slide. I imagine changing this is a lot of work, I just wanted to point it out :slight_smile:

Hi @Davide_Cividino, regarding your comment: in the right column, dw = (1/m) X dZ^T is, as you said, a single vector with one entry per feature. In the left column, the term x(i)*dz(i) sits inside a for loop, so it is a plain scalar multiplication whose result is added to the running sum dw1.
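
As a rough sketch of how the two columns relate, the loop version and the vectorized version can be compared directly in NumPy. The sizes and random data below are assumptions for illustration, not values from the slide:

```python
import numpy as np

n, m = 2, 4                       # hypothetical sizes: 2 features, 4 examples
rng = np.random.default_rng(1)
X = rng.standard_normal((n, m))
dZ = rng.standard_normal((1, m))

# Left column: explicit loops, scalar products accumulated per feature.
dw_loop = np.zeros((n, 1))
for i in range(m):                # loop over examples
    for j in range(n):            # loop over features (dw1, dw2, ...)
        dw_loop[j, 0] += X[j, i] * dZ[0, i]   # plain scalar multiplication
dw_loop /= m

# Right column: the same quantity as a single matrix product.
dw_vec = np.dot(X, dZ.T) / m

assert np.allclose(dw_loop, dw_vec)
```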


You just have to be clear about the notational conventions that Prof Ng uses. He will always explicitly use “*” when he’s writing a mathematical expression and he means elementwise multiply. If there is no explicit operator, he always means “dot product”. Here’s a thread which discusses that in more detail.

Of course you also have to be conscious of whether he’s writing math or Python. The two are different in many ways. E.g. this:

s(1 - s)

means something completely different in math than it does in Python. If you write that in Python, it means that s is a function and you are invoking it with the argument 1 - s. That will not end well.
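
A small sketch of what happens in practice, assuming s is a NumPy array of sigmoid outputs (the values here are made up):

```python
import numpy as np

s = np.array([0.2, 0.5, 0.9])    # hypothetical sigmoid outputs

try:
    s(1 - s)                     # Python reads this as "call s with argument 1 - s"
    call_failed = False
except TypeError as err:         # an ndarray is not callable
    call_failed = True
    print("s(1 - s) fails:", err)

ds = s * (1 - s)                 # the math expression needs an explicit elementwise *
print(ds)
```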

Notice that earlier in that column he writes both the mathematical expression and the Python expression for the linear activation, and the former is:

Z = w^TX + b

Then you see the np.dot when he does you a favor and writes the same thing in Python. In the later expression you point out, he does not write the Python, only the math, as Kin pointed out earlier.
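
For reference, a minimal sketch of that translation from math to NumPy, assuming w has shape (n,1), X has shape (n,m) and b is a scalar (the sizes and values are illustrative only):

```python
import numpy as np

n, m = 3, 4                       # hypothetical sizes
rng = np.random.default_rng(2)
w = rng.standard_normal((n, 1))   # weights as a column vector
X = rng.standard_normal((n, m))   # one column per example
b = 0.5                           # scalar bias

# Math: Z = w^T X + b. No explicit operator between w^T and X means dot product.
Z = np.dot(w.T, X) + b            # (1, n) x (n, m) -> (1, m); b is broadcast
print(Z.shape)                    # (1, 4)

# The @ operator is an equivalent way to write the same matrix product.
assert np.allclose(Z, w.T @ X + b)
```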