I can’t seem to figure out how to write the equation for dJ/dw in code.
Isn’t it just dw = (1/m) * X * (A - Y).T
I get this error:
ValueError: operands could not be broadcast together with shapes (2,3) (3,1)
Could someone please point me in the right direction?
The operation between X and (A - Y).T is supposed to be a dot product, not an “elementwise” multiply. The thing to realize is that in Prof Ng’s notation, when he writes two matrices adjacent in a math formula with no explicit operator between them that means the operation is “dot product” style matrix multiply. When he means “elementwise” multiply, he will always use “*”.
Those operations are fundamentally different: they do different things and have different rules.
In numpy, you can express “elementwise” multiply as the operator “*” or by using the np.multiply function. To express the dot product, you use the operator “@” or the function np.dot or np.matmul. Those are all equivalent for our purposes, but Prof Ng always uses np.dot for matrix multiply.
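Here is a minimal sketch of the difference, using the shapes from the error message. The values of X, A and Y are made up for illustration and are not taken from the assignment; only the shapes (2,3) and (1,3) matter.

```python
import numpy as np

# Made-up example data: 2 features, m = 3 samples (illustrative values only).
m = 3
X = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])   # shape (2, 3)
A = np.array([[0.9, 0.2, 0.6]])   # shape (1, 3), activations
Y = np.array([[1.0, 0.0, 1.0]])   # shape (1, 3), labels

# Elementwise multiply: numpy tries to broadcast (2, 3) against (3, 1) and fails.
try:
    dw_wrong = (1 / m) * X * (A - Y).T
    elementwise_failed = False
except ValueError as err:
    elementwise_failed = True
    print("elementwise multiply failed:", err)

# Matrix multiply: (2, 3) dot (3, 1) gives (2, 1), one gradient entry per feature.
dw = (1 / m) * np.dot(X, (A - Y).T)

# The "@" operator gives the same result for these 2-D arrays.
dw_at = (1 / m) * (X @ (A - Y).T)

print(dw.shape)
```

The scalar factor (1/m) is fine with “*” because a scalar broadcasts against any shape; only the product of X with (A - Y).T needs np.dot.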
Thanks Paul for that clarification.
Thanks for the information. Just please be aware that in the video Prof Ng writes Z = np.dot(w.T, X) + b and then writes dw = 1/m X dZ.T. So he explicitly indicates np.dot for Z but not for dw. I think that’s inconsistent. I’d suggest changing that part or adding a comment before the exercise starts.
You just need to understand the notation that Prof Ng uses. I explained it in my earlier reply on this thread. Of course you also need to be aware of when he is writing numpy code and when he is writing math formulas. If you see two operands simply juxtaposed, that can’t be Python code, right?