Dropout cost becomes NaN

print_himanshu · June 10, 2022, 5:33am

I was trying to implement a dropout model following the format of Andrew Ng's Deep Learning course 1, week 4. The data comes from the regularization assignment of course 2, week 1.

“dropout_project.ipynb” is the main project file. When I ran it, the cost became nan after 1500 iterations.
“deep_nn.py” and “dropout_and_regularization.py” are the helper function files.
I have tested my implementation for bugs.

I also have one doubt: does the “d” variable change on every iteration, or is it a fixed constant across iterations? In my implementation I have kept the values of d1 and d2 fixed by calling np.random.seed(1) again at the start of each iteration.

Could someone please help me?

deep_nn.py (10.2 KB)
dropout_and_regularization.py (6.9 KB)
dropout_project.ipynb (21.4 KB)

Cost Error:

Min-max analysis of aL after every 100 iterations for the different models. The max value of aL in the dropout model becomes 1 after 700 iterations.
After 1500 iterations the min value of aL is 0 and the max value is 1, which results in the cost error and a divide-by-zero error in daL.

[Screenshot 2022-06-10 175144, 1730×649, 34 KB]

balaji.ambresh · June 10, 2022, 5:40am

Please click my name and message your notebook as an attachment.

balaji.ambresh · June 10, 2022, 6:13am

Hi Himanshu,
Thank you for the notebook.
Please move your original post to the General Discussions topic, since your concern is not directly related to course assignments.
Someone with the bandwidth to help out on your personal project(s) will contact you and look at your code.

I’ll leave you with one tip. You can get away with building a Keras Sequential model and still use dropout, dense, and other layers. Look at this link to construct your model. Don’t worry about backpropagation; TensorFlow takes care of it implicitly.

Cheers.

paulinpaloalto · June 10, 2022, 3:26pm

Hi, Himanshu.

I have not looked at your code, but one point to make is that perfectly correct code can get NaN for the cost with either sigmoid or softmax output if the activation value “saturates” to exactly 0 or 1. You can add some logic to your cost calculations to check for that case and avoid getting NaN. Here’s a thread which discusses that.
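
As a rough illustration of that kind of check, here is a minimal sketch assuming a sigmoid output and the binary cross-entropy cost from the course. The function name `compute_cost_clipped` and the `eps` margin are hypothetical choices for this example, not something taken from the attached files:

```python
import numpy as np

def compute_cost_clipped(aL, Y, eps=1e-12):
    """Binary cross-entropy that stays finite when aL saturates.

    aL, Y: arrays of shape (1, m). eps is an assumed clipping margin.
    """
    m = Y.shape[1]
    # Keep activations strictly inside (0, 1) so np.log never sees 0.
    aL_safe = np.clip(aL, eps, 1.0 - eps)
    losses = -(Y * np.log(aL_safe) + (1.0 - Y) * np.log(1.0 - aL_safe))
    return float(np.sum(losses) / m)

# Saturated activations: exactly 0 and exactly 1, both predicted correctly.
aL = np.array([[0.0, 1.0, 0.5]])
Y = np.array([[0.0, 1.0, 1.0]])

# Unclipped formula: 0 * log(0) evaluates to 0 * -inf, which is nan.
with np.errstate(divide="ignore", invalid="ignore"):
    naive = -np.sum(Y * np.log(aL) + (1 - Y) * np.log(1 - aL)) / Y.shape[1]

cost = compute_cost_clipped(aL, Y)
print(np.isnan(naive))  # the unclipped cost is nan
print(cost)             # the clipped cost is finite
```

Note that the NaN appears even though the saturated predictions are correct, because the term that should vanish is computed as 0 times -inf. If daL is computed from Y / aL and (1 - Y) / (1 - aL), using the same clipped values there should also avoid the divide-by-zero.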

On the point about setting the random seed in every iteration, that’s the way they have us do it in the assignments just for ease of grading, but I think it’s a mistake to do that in a “real” system: that is not the intent of dropout. The whole idea is that you want the behavior to be stochastic. If you just wanted a smaller network, you could have used a smaller network. Here’s a thread which discusses that point.
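
To make the difference concrete, here is a small sketch of inverted dropout where the generator is seeded once, outside the loop, so each iteration draws a fresh mask. The helper `dropout_forward`, the `keep_prob` value, and the toy activations are assumptions for illustration only, not the assignment's code:

```python
import numpy as np

def dropout_forward(A, keep_prob, rng):
    """Inverted dropout on activations A; returns (A_dropped, mask D)."""
    D = rng.random(A.shape) < keep_prob  # new mask on every call
    A_dropped = (A * D) / keep_prob      # rescale to keep the expected value
    return A_dropped, D

# Seed once, before training, so the run is reproducible but still stochastic.
rng = np.random.default_rng(1)

A1 = np.ones((4, 5))  # stand-in for a hidden layer's activations
masks = []
for i in range(3):    # stand-in for training iterations
    A1_dropped, D1 = dropout_forward(A1, keep_prob=0.8, rng=rng)
    masks.append(D1)

# Compare the first mask with the later ones.
print([np.array_equal(masks[0], m) for m in masks[1:]])
```

Re-seeding inside the loop would instead reproduce the same D1 on every iteration, which amounts to training one fixed, smaller network.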

print_himanshu · June 11, 2022, 3:24am

Thanks for the help.
Fixing aL solved the problem.

