Hi, Himanshu.
I have not looked at your code, but one point to make is that perfectly correct code can get NaN for the cost with either sigmoid or softmax output if the activation value "saturates" to exactly 0 or 1: the cross entropy cost then involves log(0), which produces NaN. You can add some logic to your cost calculation to check for that case and avoid getting NaN. Here's a thread which discusses that.
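Just to illustrate the idea, here is a minimal sketch of one way to guard the cost against saturated activations, written for the sigmoid/binary case. The function name, the `eps` value, and the argument shapes are my own choices, not the assignment's code; for softmax you would keep only the `Y * np.log(...)` term.

```python
import numpy as np

def safe_cross_entropy_cost(AL, Y, eps=1e-8):
    """Cross entropy cost that avoids NaN when AL saturates to exactly 0 or 1.

    AL: sigmoid output activations, shape (1, m)
    Y:  true labels, same shape as AL
    eps: small constant (my choice) keeping the arguments of log() away from 0
    """
    m = Y.shape[1]
    # Clip the activations so we never evaluate log(0)
    AL_clipped = np.clip(AL, eps, 1.0 - eps)
    cost = -np.sum(Y * np.log(AL_clipped) + (1 - Y) * np.log(1 - AL_clipped)) / m
    return cost
```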
On the point about setting the random seed in every iteration, that's the way they have us do it in the assignments just for ease of grading, but I think it's a mistake to do that in a "real" system: that is not the intent of dropout. If you fix the seed on every iteration, the same neurons get dropped every time, so you've just built a fixed smaller network. The whole idea is that you want the behavior to be stochastic; if you just wanted a smaller network, you could have used a smaller network. Here's a thread which discusses that point.
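For the sake of illustration, here is a small sketch of inverted dropout the way it's meant to work, with the mask drawn fresh on every forward pass. This is not the assignment code; the function name and `keep_prob` default are my own. The key point is that the RNG is seeded once (or not at all) outside the training loop, so a different set of neurons is dropped each iteration.

```python
import numpy as np

rng = np.random.default_rng()  # seed once outside the loop, not every iteration

def dropout_forward(A, keep_prob=0.8):
    """Inverted dropout on activations A (illustrative sketch only)."""
    # Fresh random mask each call: different neurons are dropped every iteration
    D = rng.random(A.shape) < keep_prob
    # Zero out the dropped neurons and rescale so the expected activation is unchanged
    A_dropped = (A * D) / keep_prob
    return A_dropped, D
```

If you instead re-seeded the RNG inside the loop, `D` would come out identical on every iteration and the stochastic regularization effect would be lost.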