Course 1 Week 2 Problem 6 - optimize

Huby · August 23, 2021, 8:37am

Hello,
I have been stuck for a few days on an error in the propagate function.
I finished the 5th exercise without any problem, but when I used the propagate function inside the optimize function I got this error.

I’m really stuck; I can’t continue because of a simple mistake like this one.

kenb · August 23, 2021, 3:00pm

First, let’s note the proximate cause. The error traces back to your propagate function, which, as you note, passed its test. It means that w.T and X do not conform for matrix multiplication in your dot product. Given the shapes of the two, you can understand why: the column dimension of w.T does not match the row dimension of X. This is a head-scratcher, since those two matrices are set up for you in the test cell. My notebook (which I downloaded some months ago) sets up a different test. I will look into that.

I will also note that your expression of the cost function is a bit unorthodox: it uses two nested applications of the native Python function sum(). But that’s OK, it works! My guess is that a single call to np.sum() would be faster, which would be noticeable in larger applications. And, if you want to stretch yourself, you could try implementing the cost function with np.dot().
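
For readers comparing the three approaches mentioned above, here is a minimal sketch. It assumes the standard logistic regression cross-entropy cost and uses made-up values for A and Y (they are not taken from the notebook); the variable names are illustrative only.

```python
import numpy as np

# Illustrative values only, not the notebook's test data.
A = np.array([[0.9, 0.2, 0.6]])   # activations, shape (1, m)
Y = np.array([[1, 0, 1]])         # labels, shape (1, m)
m = Y.shape[1]

# Per-example log-likelihood terms, shape (1, m).
losses = Y * np.log(A) + (1 - Y) * np.log(1 - A)

# 1) Nested built-in sum(): the inner sum collapses the rows into a (m,) array,
#    the outer sum collapses that array into a scalar.
cost_nested = -sum(sum(losses)) / m

# 2) A single vectorized np.sum() over all elements.
cost_np_sum = -np.sum(losses) / m

# 3) np.dot(): (1, m) @ (m, 1) gives a (1, 1) array, squeezed to a float.
cost_dot = -(np.dot(Y, np.log(A).T) + np.dot(1 - Y, np.log(1 - A).T)) / m
cost_dot = float(np.squeeze(cost_dot))

print(cost_nested, cost_np_sum, cost_dot)
```

All three expressions compute the same number; they differ only in how the summation is carried out.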

Huby · August 23, 2021, 3:17pm

Thank you for your answer.
Indeed, at the third iteration w.T’s shape is (2,2) and not (1,2) as before. Do you mean that I have to modify the data before using it? And thank you for the advice; I’m going to replace sum with np.sum.

kenb · August 23, 2021, 3:37pm

You should not modify anything in a notebook other than the code required to complete the function. I am wondering whether there has been a version change of which I was not aware. I have sent out word to @paulinpaloalto and other mentors to look into that. Stay tuned!

paulinpaloalto · August 23, 2021, 4:11pm

That probably means there is something wrong with your “update parameters” logic. That logic is in optimize, but the error is thrown in propagate because you’ve passed it a w value that is the wrong shape because of the bug in optimize. Print the shape of w before and after the “update” statement in optimize. Why does its transpose stop being 1 x 2?

Well, I suppose that the problem could be that your dw is the wrong shape, which would be a bug in propagate. But you don’t mention failing the unit test for your propagate function. The shape of dw should be the same as that of w, right? So that the update statement:

w = w - \alpha * dw

should not change the shape of w.
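
One way to catch this early is to check the shapes at the update step itself. The sketch below is only an illustration: `update_w` is a hypothetical helper, not part of the assignment, and the arrays are placeholders chosen for their shapes.

```python
import numpy as np

def update_w(w, dw, learning_rate):
    """Hypothetical helper: a gradient step that refuses to change the shape of w."""
    assert dw.shape == w.shape, f"dw has shape {dw.shape}, expected {w.shape}"
    return w - learning_rate * dw

w = np.zeros((2, 1))
good_dw = np.ones((2, 1))   # same shape as w, as it should be
bad_dw = np.ones((2, 2))    # wrong shape, e.g. produced by an upstream bug

w_new = update_w(w, good_dw, 0.1)
print(w_new.shape)                 # (2, 1)

# Without the assert, broadcasting silently turns w into a (2, 2) array.
print((w - 0.1 * bad_dw).shape)    # (2, 2)
```

Calling `update_w(w, bad_dw, 0.1)` raises an AssertionError right at the update, instead of a broadcast error one or two iterations later inside propagate.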

jonaslalin · August 23, 2021, 5:21pm

Hello!

I gave the search engine a go and found a learner with the same broadcast error:

Error:

paulinpaloalto · August 23, 2021, 5:26pm

That’s a good point! The “cannot broadcast” error is griping about the sum operation, not the dot product. So I would conclude that both w and b are the wrong shapes. If w is n_x x 1 and X is n_x x m then the result of w^T \cdot X should be 1 x m. So how did it end up with a first dimension that is not 1?

jonaslalin · August 23, 2021, 6:02pm

Yes, but not from the beginning. A single buggy line of code causes this error. Love the butterfly effect!

If you screw up b, it will eventually find its way into w through dw, since dw depends on A, which is computed using b with the wrong shape.

To reproduce it, try this deliberately wrong update (note dw instead of db in the second line):

        w = w - learning_rate * dw
        b = b - learning_rate * dw

and

print("w.T.shape:", w.T.shape, "X.shape:", X.shape, "b.shape:", np.array(b).shape)

before np.dot.

You will see the deterioration of shapes happen:

w.T.shape: (1, 2) X.shape: (2, 3) b.shape: ()
w.T.shape: (1, 2) X.shape: (2, 3) b.shape: (2, 1)
w.T.shape: (2, 2) X.shape: (2, 3) b.shape: (2, 2)
BOOM!
ValueError: operands could not be broadcast together with shapes (2,3) (2,2)

Copy-pasting, or “copy wasting”, is usually the devil, as I always say.

It should be an easy fix @Huby.
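
For anyone who wants to watch the shapes deteriorate outside the notebook, here is a self-contained sketch. The `sigmoid` and `propagate` functions below are simplified stand-ins written for this illustration (gradients only, no cost), and the data values are made up; only the shapes match the discussion above.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def propagate(w, b, X, Y):
    """Simplified stand-in: returns the gradients only."""
    m = X.shape[1]
    A = sigmoid(np.dot(w.T, X) + b)
    dw = np.dot(X, (A - Y).T) / m
    db = np.sum(A - Y) / m
    return dw, db

# Illustrative data: w is (2, 1), b is a scalar, X is (2, 3), Y is (1, 3).
w = np.array([[1.0], [2.0]])
b = 1.5
X = np.array([[1.0, -2.0, -1.0], [3.0, 0.5, -3.2]])
Y = np.array([[1, 1, 0]])
learning_rate = 0.009

shapes = []   # (w.T.shape, b.shape) recorded at the start of each iteration
error = None
for i in range(5):
    shapes.append((w.T.shape, np.shape(b)))
    try:
        dw, db = propagate(w, b, X, Y)
    except ValueError as exc:
        error = str(exc)
        break
    w = w - learning_rate * dw
    b = b - learning_rate * dw   # BUG on purpose: should be db

for w_t_shape, b_shape in shapes:
    print("w.T.shape:", w_t_shape, "b.shape:", b_shape)
print("error:", error)
```

Replacing `dw` with `db` in the last update line keeps b a scalar and w at (2, 1), and the loop then runs all five iterations without an error.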

jonaslalin · August 23, 2021, 6:12pm

Especially interesting is the second iteration, which causes nasty broadcasting between the product and b:
w.T @ X has shape (1,3) and b has shape (2,1), so their sum is broadcast to (2,3) for A. Nasty!
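
The broadcasting step can be checked in isolation. This is a small sketch with placeholder arrays (zeros), chosen only so that their shapes match the ones quoted above.

```python
import numpy as np

product = np.zeros((1, 3))   # stands in for w.T @ X
b = np.zeros((2, 1))         # b after the faulty update

# (1, 3) + (2, 1): each size-1 axis is stretched, so no error is raised.
Z = product + b
print(Z.shape)                          # (2, 3)
print(np.broadcast(product, b).shape)   # (2, 3)

# One iteration later the shapes are (2, 3) and (2, 2). Neither trailing
# axis is 1 and they differ (3 vs 2), so NumPy raises a ValueError.
try:
    np.zeros((2, 3)) + np.zeros((2, 2))
    failed = False
except ValueError as exc:
    failed = True
    print(exc)
```

This is why the bug stays silent for one iteration: the first wrong shape happens to be broadcast-compatible, and only the next one is not.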

paulinpaloalto · August 23, 2021, 7:35pm

Eeeek! Nasty, indeed. A great example of the detail-oriented nature of programming. Literally a single wrong character leads to a bug that presents confusing symptoms and is hard to track down …

Thanks for providing the details and the cautionary tale!
