**Logistic Loss function formula**
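
For reference, this is the per-example loss as I understand it, written to match the `err` expression in the warnings quoted below. The symbols $f_{\vec{w},b}$, $\vec{x}^{(i)}$ and $y^{(i)}$ are my assumption about the course notation; `f_wb` in the code plays the role of $f_{\vec{w},b}(\vec{x}^{(i)})$:

$$
\text{loss}\big(f_{\vec{w},b}(\vec{x}^{(i)}),\, y^{(i)}\big) = -\,y^{(i)} \log\big(f_{\vec{w},b}(\vec{x}^{(i)})\big) - \big(1 - y^{(i)}\big) \log\big(1 - f_{\vec{w},b}(\vec{x}^{(i)})\big)
$$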

Given the logistic loss formula shown above, when our prediction `f_wb` gets extremely close to 0 or 1, the computation ends up evaluating `np.log(0)`, and NumPy emits these warnings:

`<ipython-input-128-06cade4fd647>:6: RuntimeWarning: divide by zero encountered in log`

`err = -y[i]*np.log(f_wb) - (1-y[i])*np.log(1-f_wb)`

`<ipython-input-128-06cade4fd647>:6: RuntimeWarning: invalid value encountered in multiply`

`err = -y[i]*np.log(f_wb) - (1-y[i])*np.log(1-f_wb)`

What should be done when we come across this dilemma?

It is an interesting question! Of course in pure math terms, the output of sigmoid can never exactly equal 0 or 1. But we are dealing with the pathetic limitations of finite floating point representations here, not the abstract beauty of $\mathbb{R}$, so this can actually happen. Here’s a thread from DLS which discusses this point.
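
To make the floating point issue concrete, here is a minimal sketch, assuming float64 and a plain `1 / (1 + exp(-z))` sigmoid. The helper names `sigmoid` and `naive_loss` are mine, not from any course assignment; `naive_loss` just repeats the expression from the warnings in the question.

```python
import numpy as np

def sigmoid(z):
    # Plain textbook sigmoid, no numerical safeguards
    return 1.0 / (1.0 + np.exp(-z))

def naive_loss(y, f_wb):
    # Same expression as the `err` line in the warnings
    return -y * np.log(f_wb) - (1 - y) * np.log(1 - f_wb)

# exp(-40) is about 4e-18, too small to change 1.0 in float64,
# so the sigmoid output rounds to exactly 1.0
f_wb = sigmoid(np.float64(40.0))
print(f_wb == 1.0)  # True

# Warnings are silenced here only to keep the output readable
with np.errstate(divide="ignore", invalid="ignore"):
    print(naive_loss(1, f_wb))  # nan, from 0 * log(0)
    print(naive_loss(0, f_wb))  # inf, from -log(0)
```

The `nan` case is where the "invalid value encountered in multiply" warning comes from: `np.log(0)` gives `-inf`, and multiplying that by a zero label term is undefined.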

I’m not familiar with the assignments in MLS, so it’s also possible that I’m missing something here and you should not be hitting this case. But the above link applies in the general case …

Thanks for your prompt reply and it helps a lot!

Well, I didn’t run into this problem while doing the assignments in MLS, but while experimenting with different data using the methods taught in the course. So your general case explanation is indeed just what I needed. Thank you : )

The issue of `y_pred` being zero arises when computing the natural logarithm `log(y_pred)`, because the logarithm is undefined for values less than or equal to zero. The same thing happens with `log(1 - y_pred)` when `y_pred` is one. To address this, a small positive value, usually called “epsilon”, is used to keep `y_pred` away from both ends so that the logarithms are well-defined. Epsilon is usually a small constant such as `1e-8` or `1e-16`.

Here is an updated implementation that clips `y_pred` using epsilon:

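```
import numpy as np

def cross_entropy(y_true, y_pred, epsilon=1e-8):
    # Number of examples
    n = y_true.shape[0]
    # Keep predictions inside [epsilon, 1 - epsilon] so neither log sees 0
    y_pred = np.clip(y_pred, epsilon, 1 - epsilon)
    # Mean binary cross-entropy over all examples
    loss = -np.sum(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred)) / n
    return loss
```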

where `np.clip(y_pred, epsilon, 1 - epsilon)` limits the values of `y_pred` to between `epsilon` and `1 - epsilon` to avoid the logarithm of zero.
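
Clipping caps the loss of any single example at `-log(epsilon)`. If you still have the pre-sigmoid values `z` around, another option (not something from this thread, just a common alternative) is to compute the loss from `z` directly and never form `sigmoid(z)` at all. A minimal sketch, assuming `z` is the linear output `w·x + b` and using the hypothetical name `cross_entropy_from_logits`:

```python
import numpy as np

def cross_entropy_from_logits(y_true, z):
    # Algebraically, -y*log(s) - (1-y)*log(1-s) with s = sigmoid(z)
    # simplifies to log(1 + exp(z)) - y*z.
    # np.logaddexp(0, z) evaluates log(exp(0) + exp(z)) without overflow.
    return np.mean(np.logaddexp(0.0, z) - y_true * z)

y = np.array([1.0, 0.0, 1.0, 0.0])
z = np.array([40.0, 40.0, -800.0, 0.5])  # includes values that saturate sigmoid

# Finite even though sigmoid(40) rounds to 1 and sigmoid(-800) rounds to 0
print(cross_entropy_from_logits(y, z))
```

For moderate `z` this agrees with the usual formula; it only differs in the saturated cases, where it reports the large but finite loss instead of `inf` or `nan`.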