As you say, and for the reasons that Ken explained, it can actually happen that you get exactly 0 or 1 as the output of sigmoid in floating point. In float64, I think it only takes roughly z > 36 to get exactly 1. There are a couple of ways you can write the code to detect that this has happened and avoid getting NaN or Inf as your cost value. Here’s a fairly recent thread in which we discussed that.
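Just to make the idea concrete, here is a minimal sketch in plain NumPy. It is not the assignment's code: the function names, the shape assumption that Y is (1, m), and the eps value of 1e-15 are all just illustrative choices. It shows the saturation and one "clipping" way to keep the cost finite.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# In float64 the sigmoid rounds to exactly 1.0 once exp(-z) is small enough
# that 1 + exp(-z) rounds back to 1.0 (around z = 37):
print(sigmoid(np.float64(40.0)) == 1.0)   # prints True

def safe_cost(A, Y, eps=1e-15):
    """Cross-entropy cost with A clipped away from exactly 0 and 1,
    so np.log never sees 0 and the cost cannot become NaN or Inf.
    Assumes Y has shape (1, m); eps is an arbitrary choice."""
    m = Y.shape[1]
    A_clipped = np.clip(A, eps, 1.0 - eps)
    return -np.sum(Y * np.log(A_clipped) + (1 - Y) * np.log(1 - A_clipped)) / m
```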
It’s an interesting thought to try random initialization instead of zero initialization. My guess is that it would probably not prevent the eventual saturation, since that’s a property of the shape of the cost surface, but it might change the number of iterations it takes you to get there. But this is just a guess: if you have a case that saturates with zero initialization, try random and see if it makes a difference. Let us know what you discover if you run that experiment!
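If you do want to try it, here is one hypothetical way to set up the comparison in NumPy. The method names and the 0.01 scale are placeholder choices, not anything prescribed by the assignment.

```python
import numpy as np

def initialize(n_features, method="zeros", scale=0.01, seed=0):
    """Return (w, b) for logistic regression.
    method="zeros" is the usual zero initialization;
    method="random" draws small Gaussian values (scale is arbitrary)."""
    b = 0.0
    if method == "zeros":
        w = np.zeros((n_features, 1))
    else:
        rng = np.random.default_rng(seed)
        w = rng.normal(0.0, scale, size=(n_features, 1))
    return w, b
```

Since the logistic regression cost is convex, both starting points should head toward the same region; the interesting thing to compare is how many iterations each takes before the activations saturate.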
Perhaps the one other interesting point (which you made in your initial post) is that the progress of back propagation is not impeded by getting a NaN value for J: the actual J value is not really used, other than as an inexpensive-to-compute proxy for how well your convergence is working. The gradients are expressed as functions and still have valid values even when J is NaN. But you are flying blind about convergence, as you said, so you would need to use another metric, e.g. prediction accuracy, which is more work to compute. Or use one of the strategies on that other thread to avoid the NaN issue “artificially”.
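To see why the gradients survive, here is a rough sketch (assuming X is (n, m) and A, Y are (1, m), as in the usual notation; the names are illustrative). Note that no logarithm appears anywhere in the gradient formulas, and the accuracy helper gives you a convergence metric that stays well defined even when J is NaN or Inf.

```python
import numpy as np

def gradients(X, A, Y):
    """Logistic regression gradients. dZ = A - Y involves no log,
    so dw and db stay finite even when A has saturated to exactly 0 or 1."""
    m = X.shape[1]
    dZ = A - Y
    dw = (X @ dZ.T) / m
    db = np.sum(dZ) / m
    return dw, db

def accuracy(A, Y, threshold=0.5):
    """Fraction of correctly classified examples: an alternative metric
    you could print instead of J when J has gone to NaN or Inf."""
    return np.mean((A > threshold).astype(int) == Y)
```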