In Exercise 5, we are asked to compute cost using np.dot. The most obvious choice (to me) is
{Moderator Edit: Solution Code Removed}
However, this can produce ‘nan’ or ‘-inf’ values when A has entries that are exactly 0 or 1. How do we avoid this issue?
Check the formula again:
J = -\frac{1}{m}\sum_{i=1}^{m}(y^{(i)}\log(a^{(i)})+(1-y^{(i)})\log(1-a^{(i)}))
You also have to use the sum function.
PS: Posting your code is not allowed. I am deleting it after this reply. Next time, please share only the full error output.
Sorry for posting the code.
I was able to pass all the tests. The issue I described appears only if I play around with different learning rates. For some learning rates, the cost function gives ‘nan’ values. I suspect this is because, with those learning rates, the weights/biases take values that are too extreme (in either direction), pushing the output of the sigmoid function to 0 or 1.
I am wondering whether there is a way to implement the cost function so that, even in the case above, it does not take ‘nan’ values but instead takes very large (or very small) values, giving gradient descent a chance to tame it over more iterations.
Thank you.
Gradient Descent does not actually depend on the J values themselves, just the derivatives of J w.r.t. the various parameters. So Gradient Descent still works even if your cost function is throwing NaNs because your sigmoid values have “saturated”. But you can also add logic to detect and avoid the saturation cases. Here’s a thread that discusses this in more detail.
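In case it helps, here is a minimal sketch of the “detect and avoid” idea (not the official course solution): clip the activations away from exactly 0 and 1 before taking the log, so the cost stays finite even when the sigmoid saturates. The function name stable_cost, the eps value, and the (1, m) shapes are just assumptions for illustration.

```python
import numpy as np

def stable_cost(A, Y, eps=1e-10):
    """Cross-entropy cost that stays finite even when A saturates at 0 or 1.

    A   : predicted probabilities, shape (1, m) -- assumed output of sigmoid
    Y   : true labels (0 or 1),    shape (1, m)
    eps : small constant; clipping A into [eps, 1 - eps] keeps log() finite
    """
    m = Y.shape[1]
    A_clipped = np.clip(A, eps, 1 - eps)  # avoids log(0) -> -inf and the resulting nan
    cost = -(1.0 / m) * (np.dot(Y, np.log(A_clipped).T)
                         + np.dot(1 - Y, np.log(1 - A_clipped).T))
    return float(np.squeeze(cost))
```

With this, a fully saturated wrong prediction gives a large but finite cost (roughly -log(eps)/m per example) instead of NaN, which is the behavior asked about above. The gradients are unchanged, so this only affects the reported J values.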