I was looking at lines 74-75 of the Python module lab_utils_common.py.
I could not understand why the cost is calculated in the following way. Can someone explain the following snippet?
    if safe:  # avoids overflows
        cost += -(y[i] * z_i) + log_1pexp(z_i)
Hi @tamalmallick , good question!
Let me try to show you some details:
- The log_1pexp(z_i) function computes log(1 + exp(z_i)) in a numerically stable way (check the comment above the code). When z_i is large, exp(z_i) can overflow if it is evaluated directly.
- The term -(y[i] * z_i ) is the part of the cost that depends on the true label y[i]: it is zero when y[i] = 0 and equals -z_i when y[i] = 1.
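To make the overflow point concrete, here is a minimal sketch comparing the textbook per-example log-loss with the "safe" form from the snippet. The names log_1pexp_sketch, naive_loss and safe_loss are hypothetical and written only for this illustration, and the cutoff of 20 is an assumption, not necessarily what lab_utils_common.py uses.

```python
import math

def sigmoid(z):
    """Logistic function 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def log_1pexp_sketch(z, maximum=20.0):
    """Hypothetical stand-in for log_1pexp: log(1 + e^z) without overflow.

    Assumption: above `maximum`, log(1 + e^z) is treated as equal to z,
    since the difference is then smaller than e^(-maximum).
    """
    if z > maximum:
        return z
    return math.log1p(math.exp(z))

def naive_loss(y, z):
    """Textbook form: -y*log(f) - (1 - y)*log(1 - f), with f = sigmoid(z)."""
    f = sigmoid(z)
    return -y * math.log(f) - (1 - y) * math.log(1 - f)

def safe_loss(y, z):
    """Form used in the snippet: -(y * z) + log(1 + e^z)."""
    return -(y * z) + log_1pexp_sketch(z)

# For moderate z the two forms agree up to rounding error.
for y in (0, 1):
    for z in (-3.0, 0.5, 2.0):
        assert abs(naive_loss(y, z) - safe_loss(y, z)) < 1e-9

# For large z the sigmoid rounds to exactly 1.0, so the textbook form
# ends up taking log(0) and fails, while the safe form stays finite.
try:
    naive_loss(0, 800.0)
    naive_failed = False
except (ValueError, OverflowError):
    naive_failed = True

print(naive_failed, safe_loss(0, 800.0), safe_loss(1, 800.0))
```

The design point is that the safe form never builds the sigmoid output first, so it never has to take the log of a value that has already been rounded to 0 or 1.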
Hi @carlosrl,
Thank you for your quick reply!
I was trying to understand how cost += -(y[i] * z_i ) + log_1pexp(z_i) fits into the log-loss cross-entropy cost function. If you can provide a little more detail on this, it would be great.
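For reference, the algebra linking the two forms can be sketched as follows. This assumes z_i is the linear output that is passed to the sigmoid; the symbol f_i for the sigmoid output is introduced here only for illustration.

```latex
\begin{aligned}
f_i &= \sigma(z_i) = \frac{1}{1 + e^{-z_i}} \\
\log f_i &= -\log\left(1 + e^{-z_i}\right) = z_i - \log\left(1 + e^{z_i}\right) \\
\log\left(1 - f_i\right) &= -\log\left(1 + e^{z_i}\right) \\
-y_i \log f_i - (1 - y_i)\log\left(1 - f_i\right)
  &= -y_i\left(z_i - \log\left(1 + e^{z_i}\right)\right) + (1 - y_i)\log\left(1 + e^{z_i}\right) \\
  &= -y_i z_i + \log\left(1 + e^{z_i}\right)
\end{aligned}
```

The last line is the expression in the snippet, with log_1pexp(z_i) standing for log(1 + e^{z_i}), so the safe branch is the same per-example log-loss written without an explicit sigmoid.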