Lab_utils_common

I was looking at lines 74-75 of the Python module lab_utils_common.py.
I could not understand why the cost is calculated this way. Can someone explain the following snippet?

    if safe:  #avoids overflows
        cost += -(y[i] * z_i ) + log_1pexp(z_i)

Hi @tamalmallick, good question!
Let me try to show you some details:

  • The log_1pexp(z_i) function is used to prevent numerical instability when computing the logarithm of a sum involving an exponential (check the comment above the code). As its name suggests, it stands for log(1 + exp(z_i)); when z_i is large, exp(z_i) can overflow if that expression is calculated directly.
  • The term -(y[i] * z_i) is the part of the cost that depends on the true label y[i].
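To make the two bullets above concrete, here is a minimal sketch, not the lab's actual code. It assumes log_1pexp(z) is meant to return log(1 + exp(z)); the helper names log_1pexp_stable, cost_direct and cost_safe are hypothetical and exist only for this illustration. Under that assumption, substituting the sigmoid into the log loss and simplifying gives -(y * z) + log(1 + exp(z)), and the sketch compares that rearranged form with the sigmoid form on a few moderate values of z.

```python
import numpy as np

def log_1pexp_stable(z):
    """Hypothetical stand-in for the lab's log_1pexp: log(1 + exp(z))."""
    # np.logaddexp(0, z) evaluates log(exp(0) + exp(z)) without overflowing.
    return np.logaddexp(0.0, z)

def cost_direct(y, z):
    """Per-example log loss written with the sigmoid (the lecture form)."""
    f = 1.0 / (1.0 + np.exp(-z))
    return -y * np.log(f) - (1.0 - y) * np.log(1.0 - f)

def cost_safe(y, z):
    """The same loss rearranged as -(y * z) + log(1 + exp(z))."""
    return -(y * z) + log_1pexp_stable(z)

# Moderate z values, where both forms can be evaluated without trouble.
z = np.array([-5.0, -0.5, 0.0, 2.0, 10.0])
y = np.array([0.0, 1.0, 1.0, 0.0, 1.0])
agree = np.allclose(cost_direct(y, z), cost_safe(y, z))
print("forms agree on moderate z:", agree)

# A very large z, where the rearranged form still returns a finite value.
large = cost_safe(np.array([0.0]), np.array([1000.0]))
print("rearranged cost at z = 1000, y = 0:", large[0])
```

For very large |z| the sigmoid form may lose precision or hit log(0), which is the situation the safe branch is trying to avoid.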

Hi @carlosrl,

Thank you for your quick reply!

I was trying to understand how cost += -(y[i] * z_i ) + log_1pexp(z_i) fits into the log-loss cross-entropy cost function. If you can provide a little more detail on this, it would be great.

Hello, @tamalmallick,

To help me answer your question, both for you and for anyone who comes to this in the future, let’s use this screenshot that shows a few more lines:

The code can compute the cost in two ways: “version 1” is the form taught in the lecture, while “version 2” is not. “Version 2” is supposed to be derived from version 1, and in this post you can see how to derive a different form from version 1.

However, even though we can derive a version 2, the one shown in the code is not correct, as it differs from my result in the linked post in two ways.

Btw, I did a quick check and found that the function that includes this piece of code is not really used by any optional lab. Let me know if I am wrong, because it was really just a quick check.

Cheers,
Raymond

The file is from the C2 W1 “optional_labs” folder.