Well, any time you have a shape error, the first question is “what shape is it?” You can add a print statement to find out:
print(f"dw.shape = {dw.shape}")
Then the next question is “How did it get that way?” But I think we can already see the problem. The mathematical formula you are implementing is this:
dw = \displaystyle \frac {1}{m} X \cdot (A - Y)^T
That is not the code that you wrote. You have used an elementwise multiply between X and (A - Y)^T.
Note that the way Prof Ng prefers to write the above formula is this:
dw = \displaystyle \frac {1}{m} X (A - Y)^T
In his preferred notation, writing the two operands adjacent with no explicit operator means that the operation is a “real” matrix multiply, meaning np.dot. Here’s a thread which discusses this notation in more detail.
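
To make the difference concrete, here is a small sketch that compares the two operations on toy data. The sizes (n_x = 4, m = 3) and the random values are my own assumptions for illustration, not anything from your notebook; I am only assuming the usual course shapes, with X of shape (n_x, m) and A and Y of shape (1, m).

```python
import numpy as np

# Hypothetical toy sizes, chosen only for illustration
n_x, m = 4, 3
rng = np.random.default_rng(0)

X = rng.standard_normal((n_x, m))   # features, shape (n_x, m)
A = rng.uniform(size=(1, m))        # activations, shape (1, m)
Y = np.array([[1, 0, 1]])           # labels, shape (1, m)

# Matrix multiply: (n_x, m) dot (m, 1) gives (n_x, 1), the same shape as w
dw = np.dot(X, (A - Y).T) / m
print(f"dw.shape = {dw.shape}")

# Elementwise multiply: (n_x, m) * (m, 1) cannot broadcast when n_x != m
elementwise_failed = False
try:
    bad = X * (A - Y).T
    print(f"bad.shape = {bad.shape}")
except ValueError as err:
    elementwise_failed = True
    print(f"elementwise multiply failed: {err}")
```

With these sizes the elementwise version raises a ValueError. Note that if n_x happened to equal m, broadcasting would succeed and silently give you an (m, m) result instead of the (n_x, 1) gradient, which is why printing the shape is worth doing.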