Nan in week 3 assignment

l.huracan · April 18, 2021, 1:24pm

I passed all tests successfully and even checked the correspondence of all numbers.
But for some reason I get nan numbers when I test the model.
I suspected the learning_rate was too high (I don't see it specified anywhere, by the way), but it happens even when I pass a tiny number as learning_rate.

Any idea?

Cost after iteration 0: 0.692739
Cost after iteration 1000: nan
Cost after iteration 2000: nan
Cost after iteration 3000: nan
Cost after iteration 4000: nan
Cost after iteration 5000: nan
Cost after iteration 6000: nan
Cost after iteration 7000: nan
Cost after iteration 8000: nan
Cost after iteration 9000: nan
W1 = [[nan nan]
 [nan nan]
 [nan nan]
 [nan nan]]
b1 = [[nan]
 [nan]
 [nan]
 [nan]]
W2 = [[nan nan nan nan]]
b2 = [[nan]]
 All tests passed.

yanivh · April 18, 2021, 2:26pm

Try debugging your output with a smaller number of iterations (even just 1) to catch where the NaNs start. If you are lucky, they will show up right after the first iteration, which makes this easy. I would guess a division by zero. A single one is enough to pollute everything with NaNs (for example, the error at the last layer could be corrupted and then backpropagated through every layer).

petrifast · April 18, 2021, 7:49pm

Hi @l.huracan,

Congrats on solving most of the problem! The answer from @yanivh gets you in the right direction, here’s a little bit more along those lines.

I would also print out the cost on every iteration.

        # Print the cost on every iteration while debugging
        if print_cost:
            print("Cost after iteration %i: %f" % (i, cost))
#        # Original: print the cost every 1000 iterations
#         if print_cost and i % 1000 == 0:
#             print ("Cost after iteration %i: %f" %(i, cost))

Run the example with a small number of iterations (say 5).

You can then print the inputs to the compute_cost function by adding before the call to compute_cost:

print ("Iteration %i" %i)
print("A2")
print(A2)
print("cache")
print(cache)

Now look at the inputs to your compute_cost function right before it returns nan. Is there anything wrong with them? If you pass in a nan, you will get nan back.

Try calling compute_cost with these values in a separate cell, do you get nan? Walk through the cost function calculation and see if you have a division by zero or some other problem.
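
As a rough sketch of the inspection described above, a small helper can flag the first array that contains NaN or inf. The name report_non_finite and the sample values for A2 and W1 are made up for illustration; they are not part of the assignment code.

```python
import numpy as np

def report_non_finite(name, array):
    """Print a summary and return True if `array` contains NaN or inf."""
    array = np.asarray(array, dtype=float)
    bad = ~np.isfinite(array)
    if bad.any():
        first = tuple(int(i) for i in np.argwhere(bad)[0])
        print(f"{name}: {int(bad.sum())} non-finite value(s), first at index {first}")
        return True
    return False

# Hypothetical values standing in for one iteration's outputs
A2 = np.array([[0.3, np.nan, 0.9]])
W1 = np.array([[0.1, -0.2], [0.4, 0.5]])

found_in_A2 = report_non_finite("A2", A2)
found_in_W1 = report_non_finite("W1", W1)
```

Calling such a helper on A2, the gradients, and each parameter inside the loop shows which quantity goes bad first, which narrows down the function to inspect.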

Let us know how it goes. Good luck!

Petri


rajsura82 · April 20, 2021, 10:39am

I have a similar problem with the cost function:

cost = (1/-m) * np.sum (( Y*np.log(AL) + (1-Y)*np.log(1-AL)), axis=1, keepdims=True )

There is no division by zero, but the log of 0 is undefined.
At what negative power of e will an AL value be treated as zero by the log function?

Here is the AL that generates NaN:

[[9.99999999e-01 1.00000000e+00 1.00000000e+00 1.00000000e+00
1.00000000e+00 9.99998697e-01 9.99689055e-01 9.98987296e-01
1.00000000e+00 1.00000000e+00 1.00000000e+00 1.00000000e+00
1.00000000e+00 1.00000000e+00 9.99997766e-01 1.00000000e+00
1.00000000e+00 1.00000000e+00 9.99975557e-01 1.00000000e+00
1.76802493e-09 3.45822717e-03 4.95418096e-05 3.09661718e-12
3.30659081e-21 1.62800772e-14 3.88169021e-18 2.83461154e-04
3.99641238e-12 9.09844786e-04 2.56082775e-07 1.57937604e-13
3.50683199e-17 2.64660330e-19 6.07843301e-12 6.97355840e-07
9.97489506e-01 1.92262499e-01 1.92262499e-01 3.88829639e-07
1.92262499e-01 1.92262499e-01 1.92262499e-01 1.17367222e-03
1.92262499e-01 9.30968322e-04 1.90474557e-11 1.14305555e-16
7.07184793e-19 2.79591223e-24 8.11220465e-13 6.66724188e-07
1.92212670e-09 1.63457889e-07 2.61283985e-18 3.13871697e-21
1.01108492e-19 1.68017100e-16 2.73019371e-12 2.13968595e-09
1.14859750e-08 1.17481339e-08 1.01151413e-03 8.90329842e-05
2.36488606e-05 5.10787093e-05 1.67056827e-12 5.36222634e-20
3.87110820e-21 4.68735063e-11 9.51776437e-16 2.85277910e-22
1.99680055e-19 1.28592778e-14 2.51287116e-17 5.07699327e-16
3.06743585e-13 1.92262499e-01 1.92262499e-01 5.28983880e-06
1.92262499e-01 1.00000000e+00 1.00000000e+00 9.99999873e-01
1.92262499e-01 1.00000000e+00 1.00000000e+00 1.00000000e+00
1.00000000e+00 1.00000000e+00 1.00000000e+00 1.00000000e+00
9.99845878e-01 1.00000000e+00 9.98268243e-01 9.99626018e-01
1.00000000e+00 9.99650146e-01 9.99999994e-01 9.99999478e-01
9.99998160e-01 9.99999999e-01 1.00000000e+00 1.00000000e+00
1.00000000e+00 1.00000000e+00 1.00000000e+00 9.99999979e-01
9.99980645e-01 1.00000000e+00 9.87470723e-01 9.99999908e-01
9.99999993e-01 1.00000000e+00 1.00000000e+00 1.00000000e+00
9.99796011e-01 1.92262499e-01 3.20185928e-04 1.92262499e-01
1.92262499e-01 9.99997600e-01 1.00000000e+00 1.00000000e+00
1.00000000e+00 1.00000000e+00 9.99999996e-01 1.00000000e+00
9.99999998e-01 1.00000000e+00 1.92262499e-01 1.00000000e+00
1.00000000e+00 1.00000000e+00 9.99906054e-01 1.00000000e+00
1.00000000e+00 1.00000000e+00 9.99999887e-01 9.99999995e-01
9.99251191e-01 1.92262499e-01 9.99999960e-01 1.00000000e+00
1.00000000e+00 9.99999880e-01 1.00000000e+00 1.00000000e+00
9.99999999e-01 9.99998947e-01 9.99159343e-01 1.92262499e-01
6.16672750e-11 1.49495403e-02 2.47824978e-08 1.53590881e-04
1.92262499e-01 5.93266437e-16 1.94438489e-04 1.92262499e-01
2.23802892e-09 8.39215765e-11 1.92262499e-01 5.34975231e-04
2.13053102e-04 1.34732518e-04 1.92262499e-01 3.84013600e-05
1.92262499e-01 1.87175856e-11 2.13609133e-08 1.67417856e-11
1.92262499e-01 6.48063759e-18 1.44794138e-12 3.83071664e-04
2.58898346e-03 3.55035615e-04 2.00564482e-05 1.92262499e-01
9.30148091e-04 6.58495710e-15 1.24295185e-05 2.04945662e-07
5.82963290e-08 2.18362467e-06 1.92262499e-01 8.56487656e-07
4.41973725e-07 1.82210084e-04 7.29905276e-12 1.00000000e+00
1.00000000e+00 9.99630138e-01 1.00000000e+00 1.00000000e+00
1.00000000e+00 1.00000000e+00 1.00000000e+00 9.99999989e-01
1.00000000e+00 9.99996425e-01 1.00000000e+00 9.99706698e-01
1.34293004e-11 9.99477172e-01 9.99334939e-01 9.97441531e-01
1.00000000e+00 1.92262499e-01 1.92262499e-01]]
Cost after iteration 20000: nan
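
A minimal sketch of how values like these can produce nan, assuming float64 arithmetic. When an AL entry is exactly 1.0 and its label is 1, the second term becomes 0 * log(0) = 0 * (-inf), which is nan. A sigmoid output becomes exactly 1.0 once exp(-z) is too small to change 1 + exp(-z), which happens for z of roughly 37 and above. The cross_entropy_cost function, its eps argument, and the three sample entries are illustrative only. Clipping is a common workaround and not something the assignment prescribes, and it hides the symptom rather than explaining why the activations saturate.

```python
import numpy as np

def cross_entropy_cost(AL, Y, eps=0.0):
    """Binary cross-entropy; if eps > 0, AL is first clipped into [eps, 1 - eps]."""
    m = Y.shape[1]
    if eps > 0:
        AL = np.clip(AL, eps, 1 - eps)
    losses = Y * np.log(AL) + (1 - Y) * np.log(1 - AL)
    return float(-np.sum(losses) / m)

Y = np.array([[1.0, 0.0, 1.0]])
# Illustrative entries; the first one is exactly 1.0
AL = np.array([[1.0, 1.92262499e-01, 9.99999999e-01]])

# Silence the warnings NumPy would emit for log(0) and 0 * inf
with np.errstate(divide="ignore", invalid="ignore"):
    raw_cost = cross_entropy_cost(AL, Y)          # contains 0 * log(0)

clipped_cost = cross_entropy_cost(AL, Y, eps=1e-12)

# sigmoid(40) rounds to exactly 1.0 in float64
saturated = 1.0 / (1.0 + np.exp(-40.0))
print(raw_cost, clipped_cost, saturated == 1.0)
```

Note that a printed 1.00000000e+00 may be exactly 1.0 or just very close to it; comparing with `AL == 1.0` tells the two apart.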

l.huracan · April 20, 2021, 8:59pm

Thank you for your quick responses!

In my case, I overlooked a missing division by m in the backpropagation (it still said “all tests passed”).
Now it works!

kshitijsharma · May 1, 2021, 6:11am

Hi,
I am facing a similar problem. The cost doesn’t change. Why could that be happening? Can anyone explain this to me?

Thanks.

MuhammedHasanKayapin · May 1, 2021, 1:19pm

Hi mate, your model is not updating any parameters, so it prints the same cost at every iteration. Check the parameter update step first. If that part is correct, check whether you actually call it inside nn_model. Normally we expect a large cost from the first forward pass and then decreasing cost values after each iteration, because backprop updates the parameters so that the cost function converges to a specific value.

kshitijsharma · May 1, 2021, 1:59pm

Thanks for the reply man.
I am not able to locate any error whatsoever.
I have backpropagated myself to the beginning of my implemented code to update it. I guess my brain needs an upgrade itself. I guess I will write the whole code again maybe.

petrifast · May 2, 2021, 12:13am

Hi @kshitijsharma,

I would investigate the loop in the nn_model further.

If the cost function is not updating, it means the parameters are not being updated.

Start by checking the cost function calculation: does changing parameter values change the cost?

To examine the loop nn_model a little closer:
I would limit the number of iterations to 5 (or something small) and print out the gradients. If the grads terms are zero, the parameter values won’t update. If that’s the case, examine your backpropagation: why is it returning zeros for the grads?

If the grads are non-zero but your parameters are not updating, then your gradient descent step is not working as expected: non-zero grads should lead to parameters updating which means the cost should update.
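
The two checks above could be sketched roughly as follows. The helpers max_abs_change and gradient_step, the key naming ("d" + parameter name), and the sample numbers are assumptions for illustration and do not come from the assignment notebook.

```python
import numpy as np

def max_abs_change(old_params, new_params):
    """Largest absolute change of each parameter array between two dicts."""
    return {k: float(np.max(np.abs(new_params[k] - old_params[k])))
            for k in old_params}

def gradient_step(parameters, grads, learning_rate):
    """Plain gradient descent step; assumes grads are keyed 'd' + parameter name."""
    return {k: v - learning_rate * grads["d" + k] for k, v in parameters.items()}

# Hypothetical parameters and gradients for a single unit
parameters = {"W1": np.array([[0.5, -0.3]]), "b1": np.array([[0.0]])}
grads = {"dW1": np.array([[0.2, 0.1]]), "db1": np.array([[0.0]])}

# Check 1: are the gradients themselves zero?
grad_sizes = {k: float(np.max(np.abs(g))) for k, g in grads.items()}

# Check 2: did the update actually move the parameters?
updated = gradient_step(parameters, grads, learning_rate=1.0)
changes = max_abs_change(parameters, updated)
print(grad_sizes)
print(changes)
```

If the gradient sizes are non-zero but the dictionary the loop uses on the next iteration shows no change, the result of the update step is probably not reaching the variable that the next forward pass reads.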

Let me know what you find.

Good luck!

Best,
Petri

kshitijsharma · May 2, 2021, 2:16am

Hey man @petrifast
I investigated the loop in the nn_model further and the problem was…
I wrote the spelling of parameters as paramters and that is why the parameters weren’t getting updated. Very dumb of me.
The probable cause is sleep deprivation.
Thank you for replying.
Kshitij Sharma.

petrifast · May 2, 2021, 2:52am

Congrats @kshitijsharma!

All bugs are obvious… once you see them!

It’s not a silly mistake.

You just debugged your machine learning model, that’s pretty cool and a key skill in being a data scientist.

Keep up the good work.

Best,
Petri
