Hello,
I am requesting help with the loss function in DLS Course 1, Week 2, Exercise 5, under the “def propagate:” function. The assignment asks the user to compute the activation (A) and to use np.dot() to compute the cost. They gave us the cost function as a hint, see below.
However, no matter how I code it, I always either calculate an incorrect cost or receive an error. For example, my line of code for the cost function is, at least in my eyes, EXACTLY what was requested of the user, yet it still results in an error.
Can anyone help me with this? If this line of code is not exactly what was requested of the user in completing the assignment, are there directions that I am missing?
Thank you in advance!
J
Hi @Joseph_Girsch ,
How about this:
The formula is showing a SUM from 1 to m, so it is traversing each element of y and a, right?
Is your formula traversing each element of y and a? Or is it doing vectorized operations that consider the entire Y and A at once, in which case the sum would not be needed?
That’s one of the hints I’d share with you.
Once you check that part, you may find that it still doesn’t work. And that has to do with the shapes of Y and A: You may have to do something about the shape of A.
Try this out and let me know how it goes.
Juan
To make Juan’s hint one step more specific, note that you have to understand the definition of dot product style matrix multiply. In the case here, both Y and A are vectors of shape 1 x m, where m is the number of samples. So the way you wrote the code with np.dot, you are doing the dot product of 1 x m by 1 x m, but that doesn’t work, right? It throws an error because the “inner” dimensions don’t match. But notice that if you dotted 1 x m with m x 1, then the result would be 1 x 1 or a scalar. That’s what Juan meant by “doing something” with the shape of A and then you wouldn’t need the sum, once you get the dot product right.
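The shape argument above can be checked on a toy example. This is only a sketch: the values of Y and A are made up for illustration and are not taken from the assignment.

```python
import numpy as np

# Made-up labels and activations, both of shape (1, m) with m = 4.
Y = np.array([[1, 0, 1, 0]])
A = np.array([[0.9, 0.2, 0.7, 0.4]])

# (1, m) dotted with (1, m): the inner dimensions (m and 1) do not match.
try:
    np.dot(Y, np.log(A))
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

# (1, m) dotted with (m, 1): the inner dimensions match, result is (1, 1).
dot_result = np.dot(Y, np.log(A).T)

print(mismatch_raised, dot_result.shape)
```

Transposing the second operand is what turns the failing call into a 1 x 1 result, so no separate sum is needed for that term.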
Of course there is an alternative way that you could get the sum of the products of the corresponding elements of two 1 x m vectors: you could use “elementwise” multiply, which is * or np.multiply, followed by np.sum to add up the products.
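As a rough illustration that the two routes agree, here is a sketch that computes the same cross-entropy cost both ways. The Y and A values are made up, and the variable names cost_elementwise and cost_dot are just for this example.

```python
import numpy as np

# Made-up labels and activations, both of shape (1, m).
Y = np.array([[1.0, 0.0, 1.0, 0.0]])
A = np.array([[0.9, 0.2, 0.7, 0.4]])
m = Y.shape[1]

# Route 1: elementwise multiply, then np.sum over all m products.
cost_elementwise = (-1 / m) * np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A))

# Route 2: dot product of (1, m) with (m, 1); the sum is built into the dot.
cost_dot = (-1 / m) * (np.dot(Y, np.log(A).T) + np.dot(1 - Y, np.log(1 - A).T))
cost_dot = float(np.squeeze(cost_dot))  # collapse the (1, 1) array to a scalar

print(np.isclose(cost_elementwise, cost_dot))
```

Either route avoids an explicit loop; the dot-product route just needs the transpose so that the shapes line up.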
I fixed it. So apparently I need to get better at reading directions:
"""
# compute activation
# A = ...
# compute cost **by using np.dot** to perform multiplication.
# And don't use loops for the sum.
# cost = ...
"""
The directions explicitly say to use the dot product in the activation (which I did not do) and not in the cost function (where I did use it). Once I flipped that around, the function worked like a charm!
Previously I incorrectly had:
A = sigmoid((w.T*X) + b)
cost = (-1/m)*np.sum(np.dot(Y,np.log(A)) + np.dot((1-Y),np.log(1-A)))
With the correct multiplication assigned:
A = sigmoid(np.dot(w.T,X) + b)
cost = (-1/m)*np.sum(Y*np.log(A) + (1-Y)*np.log(1-A))
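For anyone wondering why the activation line made the difference, here is a small sketch with made-up shapes: w is assumed to be (n, 1) and X to be (n, m) with n = 2 and m = 3, and sigmoid is a stand-in defined locally rather than the course's own helper.

```python
import numpy as np

def sigmoid(z):
    # Stand-in for the course's sigmoid helper (an assumption for this sketch).
    return 1 / (1 + np.exp(-z))

# Made-up parameters: n = 2 features, m = 3 samples.
w = np.array([[1.0], [2.0]])            # shape (2, 1)
X = np.array([[1.0, 2.0, -1.0],
              [3.0, 4.0, -3.2]])        # shape (2, 3)
b = 2.0

# Elementwise *: shapes (1, 2) and (2, 3) cannot be broadcast together.
try:
    w.T * X
    broadcast_failed = False
except ValueError:
    broadcast_failed = True

# Matrix product: (1, 2) dot (2, 3) gives one activation per sample, (1, 3).
A = sigmoid(np.dot(w.T, X) + b)

print(broadcast_failed, A.shape)
```

Note that when n happens to equal m, the elementwise version does not raise at all and instead silently produces an (n, m) array, which then yields a wrong cost rather than an error.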
Thank you so much for everyone’s help!