Hello,
I’ve been stuck for a few days on an error in the propagate function.
I finished the 5th exercise without a problem, but when I used the propagate function inside the optimize function, I got this error.
First, let’s note the proximate cause. The error traces back to your propagate function, which, as you note, passed its test. The message means that your dot product of w.T and X does not conform for proper matrix multiplication. Given the shapes of the two, you can see why: the column dimension of w.T does not match the row dimension of X. This is a head-scratcher, since those two matrices are set up for you in the test cell. My notebook (which I downloaded some months ago) sets up a different test. I will look into that.
I will also note that your expression of the cost function is a bit unorthodox, using two (nested) applications of the native Python function sum(). But that’s OK, it works! I am guessing that a single application of np.sum() would be faster, which would be noticeable in larger applications. (??) And, if you want to stretch yourself, you may try to implement the cost function using the np.dot() function.
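To make the comparison concrete, here is a minimal sketch of the three variants. It assumes A (the activations) and Y (the labels) both have shape (1, m); the function names are made up for illustration and are not from the notebook.

```python
import numpy as np

def cost_nested_sum(A, Y):
    """Cross-entropy cost using two nested calls to Python's built-in sum()."""
    m = Y.shape[1]
    losses = Y * np.log(A) + (1 - Y) * np.log(1 - A)   # shape (1, m)
    # The inner sum adds up the rows, the outer sum adds up the remaining entries.
    return -sum(sum(losses)) / m

def cost_np_sum(A, Y):
    """Same cost with a single vectorized np.sum() call."""
    m = Y.shape[1]
    losses = Y * np.log(A) + (1 - Y) * np.log(1 - A)
    return -np.sum(losses) / m

def cost_np_dot(A, Y):
    """Same cost written as two dot products of (1, m) by (m, 1)."""
    m = Y.shape[1]
    total = np.dot(Y, np.log(A).T) + np.dot(1 - Y, np.log(1 - A).T)  # shape (1, 1)
    return -total.item() / m

# Small made-up example: three samples.
A = np.array([[0.9, 0.2, 0.7]])
Y = np.array([[1, 0, 1]])
costs = (cost_nested_sum(A, Y), cost_np_sum(A, Y), cost_np_dot(A, Y))
print(costs)
```

All three should agree up to floating-point rounding; the difference is only in how the summation is carried out.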
Thank you for your answer,
Indeed, at the third iteration w.T’s shape is (2,2) and not (1,2) like before. Do you mean that I have to modify the data before using it? And thank you for the advice, I’m going to replace sum with np.sum.
You should not modify anything in a notebook other than the code required to complete the function. I am wondering whether there has been a version change of which I was not aware. I have sent out word to @paulinpaloalto and other mentors to look into that. Stay tuned!
That probably means there is something wrong with your “update parameters” logic. That logic is in optimize, but the error is thrown in propagate because you’ve passed it a w value that is the wrong shape because of the bug in optimize. Print the shape of w before and after the “update” statement in optimize. Why does it change from 1 x 2?
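As a sketch of that debugging step, the snippet below prints the shape of w before and after a gradient-descent update. The helper name and the values are hypothetical, and the transposed dw is just one assumed way the shape could go wrong, not necessarily what is happening in your notebook.

```python
import numpy as np

def update_with_shape_check(w, dw, learning_rate):
    """One gradient-descent step that reports the shape of w before and after."""
    print("w before update:", w.shape)
    w_new = w - learning_rate * dw
    print("w after update: ", w_new.shape)
    return w_new

w = np.array([[1.0], [2.0]])         # shape (2, 1), so w.T is (1, 2)
good_dw = np.array([[0.1], [0.2]])   # shape (2, 1), same as w
bad_dw = good_dw.T                   # shape (1, 2): hypothetical bug

w_ok = update_with_shape_check(w, good_dw, 0.01)    # stays (2, 1)
w_bad = update_with_shape_check(w, bad_dw, 0.01)    # silently becomes (2, 2)
```

Because NumPy broadcasts (2, 1) against (1, 2) without complaint, the update itself raises no error. The failure only shows up later, when the (2, 2) array is passed back into propagate.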
Well, I suppose that the problem could be that your dw is the wrong shape, which would be a bug in propagate. But you don’t mention failing the unit test for your propagate function. The shape of dw should be the same as that of w, right? So that the update statement:
That’s a good point! The “cannot broadcast” error is griping about the sum operation, not the dot product. So I would conclude that both w and b are the wrong shapes. If w is n_x x 1 and X is n_x x m then the result of w^T \cdot X should be 1 x m. So how did it end up with a first dimension that is not 1?
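One way to catch this early is to assert the expected shapes before computing the linear part. This is only a sketch: linear_part is a made-up helper, and it assumes b is meant to be a scalar, as in the course setup.

```python
import numpy as np

def linear_part(w, b, X):
    """Compute Z = w.T @ X + b after checking the shapes (hypothetical helper)."""
    n_x, m = X.shape
    assert w.shape == (n_x, 1), f"w has shape {w.shape}, expected {(n_x, 1)}"
    assert np.ndim(b) == 0, f"b should be a scalar, got shape {np.shape(b)}"
    Z = np.dot(w.T, X) + b
    assert Z.shape == (1, m)
    return Z

X = np.ones((2, 3))                           # n_x = 2 features, m = 3 samples
Z = linear_part(np.zeros((2, 1)), 0.0, X)     # passes: Z has shape (1, 3)
```

With a check like this, a w of shape (2, 2) fails immediately with a readable message instead of surfacing later as a broadcasting error.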
Especially interesting is this case which causes nasty broadcasting of the product and b,
since w.T @ X has shape (1,3) and b has shape (2,1), so broadcasting to (2,3) for A. Nasty!
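For anyone who wants to see the effect in isolation, here is a small sketch with made-up values. The (2, 1) bias is the assumed wrong shape, and the variable names are illustrative only.

```python
import numpy as np

w = np.array([[1.0], [2.0]])              # shape (2, 1)
X = np.array([[1.0, 2.0, -1.0],
              [3.0, 4.0, -3.2]])          # shape (2, 3)
b_bad = np.array([[0.5], [1.5]])          # shape (2, 1): wrong-shaped bias

product = w.T @ X                         # shape (1, 3)
Z = product + b_bad                       # broadcasts to (2, 3) with no error

print(product.shape, b_bad.shape, Z.shape)
```

Each row of Z is the same product shifted by a different entry of b_bad, so the sigmoid of Z also comes out as (2, 3) rather than (1, 3).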
Eeeek! Nasty, indeed. A great example of the detail-oriented nature of programming. Literally a single wrong character leads to a bug that presents confusing symptoms and is hard to track down …
Thanks for providing the details and the cautionary tale!