Week 2, Exercise 5, computing cost using np.dot

IBP · August 3, 2021, 2:36am

Hi all,
I was super confused as to the instructions to use np.dot to compute the cost. Everything else in this course has been shown in videos, reading or exercises, but for the life of me, I cannot find anything on the computation of J across all examples using np.dot (or the equivalent mathematical formula). The vectorization videos go through everything BUT computing the cost when showing how to avoid the for-loops. I managed to do the exercises anyway with helpful hints. But no one else seems to have the same problem (and I’m guessing there must be others who aren’t experts in matrix operations yet), so I assume I must have missed something! Despite my attempts at a thorough search through the videos and training exercise (although I didn’t rewatch and redo everything), I STILL haven’t found it! If anyone can give me a hint as to where this was explained, I will be most grateful!

Best,
I.B.P

Jasp3r · August 6, 2021, 12:33pm

I have the same problem. I cannot figure out why you would have to use np.dot when the given formulas never show a dot product.

kilroyis · October 4, 2021, 2:45pm

I wonder if there’s a typo in the comment, where it says “compute cost using np.dot”. As far as I can see, that comment should say “compute cost using np.sum”. (But please take my suggestion with a pinch of salt, since I am similarly finding this fiendishly difficult!)

tmclerran · October 9, 2021, 3:40pm

I struggled with this too, but it turns out there’s a nice explanation for this in the Week 3 coding assignment, where you have to calculate cost again (week 3, exercise 5). They give some really useful tips on how to use np.sum, np.multiply, and np.dot. Here’s a screenshot from it (which doesn’t include any of my own code):

PS - if any moderators read this, please consider putting that bit of explanation in the week 2 coding assignment (and please forgive me for taking a screenshot!).

Mubsi · October 12, 2021, 5:17am

Hi @IBP, @tmclerran, @Jasp3r, @kilroyis

Thank you for bringing this to my attention. The comment is “# compute cost using np.dot. Don't use loops for the sum.” After reviewing it, I have realised these were supposed to be two separate comments, not the latter depending on the former as it currently appears. I’ll have it fixed soon.

Best,
Mubsi

Aayush_Karki · February 6, 2022, 5:34am

I am also confused by this. My function computes correctly if I use np.multiply and then np.sum, but when I use np.dot I cannot get my function to compute correctly.

Here is the problem with the np.dot I am having:
cost = -\frac{1}{m} \sum_{i=1}^{m} \left( y^{(i)} \log(a^{(i)}) + (1 - y^{(i)}) \log(1 - a^{(i)}) \right)
where,
A – activation function of size (1, number of examples)
Y – true “label” vector (containing 0 if non-cat, 1 if cat) of size (1, number of examples)

Similarly, A = \sigma(w^T X + b)
where,
w – weights, a numpy array of size (num_px * num_px * 3, 1)
b – bias, a scalar
X – data of size (num_px * num_px * 3, number of examples)

As w is of size (num_px * num_px * 3, 1), after transposing it will have size
(1, num_px * num_px * 3). After a dot product of w.T with X, whose size is
(num_px * num_px * 3, number of examples), A will have size (1, number of examples).

So now, coming back to the cost function, a dot product of A with Y does not make sense, because their sizes are not compatible: they are both row matrices.

Am I missing something?

paulinpaloalto · February 6, 2022, 5:45am

If you have two 1 x m vectors and want the dot product, you have to transpose one of them, just as you did in the w^T \cdot X case. But those were column vectors, so you have to think through how the situation changes when the vectors are row vectors.
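
To make the transpose point concrete, here is a minimal sketch with made-up values (the arrays `A` and `Y` below are illustrative, not taken from the assignment). It suggests that for two 1 x m row vectors, taking `np.dot` of one with the transpose of the other gives the same number as an elementwise multiply followed by `np.sum`, except that the result arrives wrapped in a 1 x 1 array.

```python
import numpy as np

# Illustrative values only: two row vectors of shape (1, m) with m = 3.
A = np.array([[0.8, 0.1, 0.6]])
Y = np.array([[1.0, 0.0, 1.0]])

# Elementwise product, then sum over all entries: a plain scalar.
via_sum = np.sum(Y * np.log(A))

# Dot product of a (1, m) array with an (m, 1) array: a (1, 1) array.
via_dot = np.dot(Y, np.log(A).T)

print(via_sum, via_dot.shape)
# np.dot(Y, np.log(A)) without the transpose raises a ValueError,
# because (1, m) and (1, m) are not aligned for matrix multiplication.
```

If a plain scalar is wanted from the `np.dot` route, something like `np.squeeze(via_dot)` or `via_dot[0, 0]` should do it.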

Ricarda_Vajen · April 26, 2022, 1:49pm

Thanks a lot for this comment @paulinpaloalto !
It was really a blessing to find this solution after being stuck on the exercise for a long time.

Sahin_Sev · June 24, 2022, 3:48am

I am not certain whether I should be grateful or angry. On the upside, I had to think about the problem for a long time, which made me dive deeper into the np.dot() vs np.sum() operations. I think I won’t forget the implications for quite a while.

Billu · September 20, 2022, 11:18am

Still doesn’t work for me. In the Python Basics last practice question, I get the wrong answer no matter what I do with the dot product. I get the correct answer with sum() and multiply.

Tomy · September 20, 2022, 9:59pm

Hello all! I’m getting the following error message:
AssertionError: Wrong values for costs. [5.801545319394553, nan] != [5.80154532, 0.31057104]

I don’t understand how the costs can be calculated correctly and yet a “nan” value gets appended to the list.

Here’s how my code goes:

{moderator edit - solution code removed}

I believe this is stopping me from going ahead in future exercises and that’s why I’m getting a 0/100.

I feel as if I’m also missing something regarding the w=… and b=…

paulinpaloalto · September 21, 2022, 4:40am

What answers do you get with dot product? One important thing to note is that this thread is about the cost function in Week 2 where we are dealing with 2 dimensional vectors. In the Numpy exercise the vectors only have 1 dimension, so things should be simpler: no transposes required.
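
As a small illustration of the 1-dimensional case (the arrays `x` and `y` are made up, not the notebook's data), `np.dot` on two 1-D arrays already returns the inner product as a scalar, and transposing a 1-D array leaves it unchanged:

```python
import numpy as np

# Illustrative 1-D arrays of shape (3,), not the notebook's data.
x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])

dot_result = np.dot(x, y)                # inner product, a scalar
sum_result = np.sum(np.multiply(x, y))   # same value via the elementwise route

print(dot_result, sum_result)
# For 1-D arrays, x.T has the same shape as x, so a transpose changes nothing.
print(x.T.shape)
```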

paulinpaloalto · September 21, 2022, 5:01am

@Tomy: Your “update parameters” logic is incorrect. Have another look at the math formulas for that. Your code does not agree with what the formula says. Note that the first cost is correct because that is after 0 iterations, meaning that no updates have happened yet. It only goes wrong when you update.

Also just as a general matter, please note that we aren’t supposed to publicly share solution code, even if it is incorrect. If the mentors can’t figure out how to help without seeing your code, we’ll ask to share it in a private way by DM.

Tomy · September 21, 2022, 10:57am

@paulinpaloalto thank you so much for the help, I think I got it right now!

And sorry about the code, didn’t know about that.

Thanks!

Taniya_Shafique · October 9, 2022, 5:28pm

Hi there.
I hope you all are enjoying your long weekend.

I am facing a similar problem, in the optimize function. It is a dimension mismatch, but I don’t understand why it is appearing or where it comes from. I could not debug the error in my code. Here is the error.

ValueError Traceback (most recent call last)
in
----> 1 params, grads, costs = optimize(w, b, X, Y, num_iterations=100, learning_rate=0.009, print_cost=False)
2
3 print ("w = " + str(params["w"]))
4 print ("b = " + str(params["b"]))
5 print ("dw = " + str(grads["dw"]))

in optimize(w, b, X, Y, num_iterations, learning_rate, print_cost)
36 # YOUR CODE STARTS HERE
37
---> 38 grads, cost = propagate(w, b, X, Y)
39
40

in propagate(w, b, X, Y)
31
32
---> 33 A = sigmoid( np.dot(w.T , X) + b )
34 print("A= " , A)
35 cost = (-1/m) * np.sum( Y* np.log(A) + (1-Y)* np.log(1-A) )

ValueError: operands could not be broadcast together with shapes (2,3) (2,2)

Thanks in advance for your help.

paulinpaloalto · October 9, 2022, 5:45pm

Print the type and shape of your b value before the call to propagate from optimize. I’ll bet it is a 2 x 2 numpy array, but that may not happen on the first iteration. That is wrong: it should be a scalar, right? So how did that happen? Take a careful look at how you implemented the “update parameters” logic.

Also note that the shape of your w value is also probably wrong. The result of w^T \cdot X should be 1 x m (where m is 3 in this test case). So why does it turn out to be 2 x 3?

Taniya_Shafique · October 9, 2022, 6:54pm

Thanks for your response.
In each iteration, b and the activation change shape. For i = 1, b = 1.5 and A is 1x3; for i = 2, b is 2x1 and A is 2x3. In the third iteration b becomes 2x2, which is where the error appears.

I believe that these variables should not change shape in the loop.

Taniya_Shafique · October 9, 2022, 7:00pm

Thank you.
In the update parameters step, I was writing b = b + learning_rate * dw instead of using db.

Fixed it.

paulinpaloalto · October 9, 2022, 7:40pm

Glad to hear you found the error. Also note that the operation there is subtraction, not addition.
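
For anyone who hits the same traceback, the shape drift described in this exchange can be reproduced with a short sketch. Everything here is an assumption for illustration: the data values are made up, `sigmoid` is a hypothetical stand-in for the notebook's helper, and the gradients are the standard logistic-regression ones. The update for `b` deliberately uses `dw` (and addition) to mimic the bug, so broadcasting silently grows `b` and `w` until the shapes no longer fit.

```python
import numpy as np

def sigmoid(z):
    # Hypothetical stand-in for the notebook's sigmoid helper.
    return 1.0 / (1.0 + np.exp(-z))

# Made-up data: 2 features, m = 3 examples.
w = np.array([[1.0], [2.0]])
b = 1.5
X = np.array([[1.0, -2.0, -1.0], [3.0, 0.5, -3.2]])
Y = np.array([[1.0, 1.0, 0.0]])
m = X.shape[1]
learning_rate = 0.009

shapes = []
for i in range(3):
    try:
        A = sigmoid(np.dot(w.T, X) + b)
    except ValueError as err:
        shapes.append(str(err))   # the broadcast error finally surfaces here
        break
    dw = np.dot(X, (A - Y).T) / m
    db = np.sum(A - Y) / m
    w = w - learning_rate * dw
    b = b + learning_rate * dw    # deliberate bug: dw instead of db, + instead of -
    shapes.append((A.shape, np.shape(w), np.shape(b)))

for entry in shapes:
    print(entry)
```

Printing `np.shape(b)` before each call, as suggested above, is what exposes the problem: a scalar has shape `()`, so anything else means the update went wrong.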

Luca_De_Renzo · May 17, 2023, 3:09pm

logprobs = np.dot(np.log(A2), Y.T) + np.dot(np.log(1 - A2), (1 - Y).T)

My question is:
If I were to apply the formula I would have written

logprobs= np.dot(Y.T, np.log(A2)) + np.dot((1-Y).T, np.log(1-A2))

But written this way, it doesn’t work. Why?
Maybe it has something to do with the order in which the parameters are passed to the compute_cost function?
Many thanks, and pardon me.
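
A quick shape check may help with this question. This is only a sketch, assuming A2 and Y both have shape (1, m) as elsewhere in this thread; the values are made up. With row vectors, the order of the arguments to `np.dot` decides whether the result is a 1 x 1 inner product or an m x m outer product:

```python
import numpy as np

# Assumed shapes: A2 and Y are both (1, m), here with m = 3 and made-up values.
A2 = np.array([[0.9, 0.2, 0.7]])
Y = np.array([[1.0, 0.0, 1.0]])

inner = np.dot(np.log(A2), Y.T)   # (1, m) . (m, 1) -> (1, 1)
outer = np.dot(Y.T, np.log(A2))   # (m, 1) . (1, m) -> (m, m)

print(inner.shape, outer.shape)
# The inner product equals the trace (sum of the diagonal) of the outer product;
# the off-diagonal entries pair y(i) with log(a(j)) for i != j.
print(np.isclose(inner[0, 0], np.trace(outer)))
```

Under these assumptions, the second form mixes different examples together in its off-diagonal entries, which the cost formula never does.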
