Here’s the formula for dw:

dw = \displaystyle \frac {1}{m} X \cdot (A - Y)^T

Notice that the operation between X and (A - Y)^T there is a dot product (a matrix multiply), not an elementwise multiply. Note also that I made a little “enhancement” to the notation that Prof Ng uses in order to make that clearer. If you had used * or np.multiply there, you would have gotten an error because the shapes don't broadcast, so that’s probably not the problem.
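In case it helps to see the difference concretely, here is a minimal sketch in NumPy. The sizes and the random values are made up purely for illustration (they are not from the assignment), and I'm assuming the usual shapes: X is (n_x, m) and both A and Y are (1, m).

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration
n_x, m = 4, 3  # n_x features, m examples

rng = np.random.default_rng(0)
X = rng.standard_normal((n_x, m))      # inputs, one column per example
A = rng.uniform(size=(1, m))           # stand-in for the activations
Y = rng.integers(0, 2, size=(1, m))    # stand-in for the 0/1 labels

# Dot product: (n_x, m) dot (m, 1) gives (n_x, 1), the same shape as w
dw = (1 / m) * np.dot(X, (A - Y).T)
print(dw.shape)

# Elementwise multiply: (n_x, m) and (m, 1) cannot be broadcast together here
try:
    X * (A - Y).T
    elementwise_failed = False
except ValueError as err:
    elementwise_failed = True
    print("elementwise multiply failed:", err)
```

One caveat: the elementwise version only throws when n_x and m differ. If they happened to be equal, broadcasting would quietly succeed and give you an (n_x, m) result instead of (n_x, 1), so checking the shape of dw is a good sanity test either way.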

Here’s a thread about how to tell when to use dot product versus elementwise multiply. That thread is also linked from the DLS FAQ Thread, which is worth a look if you haven’t seen it yet.