Cost function problem

This is an interesting question! Of course the \hat{y} values are the output of sigmoid, so mathematically they can never be exactly 0 or 1. But here we are dealing with the finite limitations of floating point representation, not the abstract beauty of \mathbb{R}, so the values can actually “saturate” and end up rounding to exactly 0 or 1.
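
Here is a quick demonstration of that saturation in 64-bit floats (the sigmoid helper is just for illustration):

import numpy as np

def sigmoid(z):
    return 1. / (1. + np.exp(-z))

with np.errstate(over="ignore"):   # exp() overflows for very negative z
    print(sigmoid(40.) == 1.)      # True: exp(-40) is below machine epsilon, so 1 + exp(-40) rounds to exactly 1
    print(sigmoid(-800.) == 0.)    # True: exp(800) overflows to inf, so 1 / (1 + inf) is exactly 0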

There are several ways to handle that:

You can test your \hat{y} values for exact equality to 0 or 1 and then slightly perturb the values before the cost computation:

A[A == 0.] = 1e-10        # nudge exact zeros up to a small positive value
A[A == 1.] = 1. - 1e-10   # nudge exact ones down just below 1
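
If you prefer a one-liner, np.clip does essentially the same job with the same 1e-10 margin; note that clipping also nudges values that are merely within the margin of the boundary, not just exactly equal to it:

import numpy as np

A = np.clip(A, 1e-10, 1. - 1e-10)   # force A into the open interval (0, 1)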

You can also use np.isnan() and np.isneginf() to replace any saturated values after the fact, although that’s a bit more code, since you need to catch the bad values while the cost is still the “loss” in vector form (before it gets summed down to the scalar J):

loss[np.isnan(loss) | np.isneginf(loss)] = 42.   # replace saturated entries with an arbitrary stand-in value

You could replace the non-numeric values with 0., but the point is that those cases represent a big error that should be punished pretty severely by the loss function. Of course the actual J value doesn’t really affect the gradients in any case: the derivatives are calculated separately.
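
Putting those pieces together, here is a minimal sketch of that cleanup, assuming the usual conventions (A for the sigmoid activations and Y for the labels, both of shape (1, m)); I use a negative stand-in value here so that the final sign flip turns it into a large positive cost:

import numpy as np

def compute_cost(A, Y, penalty=-42.):
    m = Y.shape[1]
    # Element-wise loss, kept in vector form so the bad entries can still be found
    with np.errstate(divide="ignore", invalid="ignore"):
        loss = Y * np.log(A) + (1. - Y) * np.log(1. - A)
    # -inf (confidently wrong prediction) and nan (0 * -inf) both show up here
    loss[np.isnan(loss) | np.isneginf(loss)] = penalty
    # The leading minus sign turns the negative stand-in into a large positive cost
    return -np.sum(loss) / m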

You can look up the documentation for numpy’s isnan() and isneginf(). There are two cases to worry about:

If \hat{y} is 1 and y is 0, then the (1 - y)\log(1 - \hat{y}) term is 1 \cdot \log(0), which evaluates to 1 \cdot (-\infty) = -\infty. But if \hat{y} is 1 and y is 1, that term is 0 \cdot (-\infty), and that is NaN (not a number). Of course you hit the same two cases in the opposite order for the y\log(\hat{y}) term, when \hat{y} = 0.
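
You can reproduce both cases directly in numpy (with the warnings suppressed for clarity):

import numpy as np

with np.errstate(divide="ignore", invalid="ignore"):
    print((1. - 0.) * np.log(1. - 1.))   # y = 0, y_hat = 1: 1 * log(0) -> -inf
    print((1. - 1.) * np.log(1. - 1.))   # y = 1, y_hat = 1: 0 * log(0) -> nan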
