Doubt regarding the notation in Softmax Lab


The above image is from the Softmax Lab in Multiclass Classification. In the indicator function (the second equation in the image), what does the “n” represent? I am guessing it is a value in {1, ..., N}?

Also, in the third paragraph, the statement says that “only the line that corresponds to the target contributes to the loss, other lines are zero”. What does this mean? Is it in regard to the output probabilities being either 1 or 0?

image
Is the above equation, which is part of the last equation in the image above, the same as the common cross-entropy representation shown below?
image


Hello @tinted

Yes. You can also see it this way:

image

Look at the indicator function again:

image

For sample i, there is only one value for its label y^{(i)}, so even though the inner summation goes from j=1 to j=N, only one of the j's will pass the indicator function, and that is when j = y^{(i)}. Such use of the indicator function is consistent with the definition of the cross-entropy loss:

image

See, even though there are ten logs on the R.H.S., only one will actually be used.
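If it helps, here is a rough numpy sketch (my own illustration, not the lab's code, and with classes indexed from 0 to N-1 as numpy does rather than 1 to N) showing that the double sum with the indicator gives exactly the same number as picking out only the target class's negative log probability:

```python
import numpy as np

np.random.seed(0)
m, N = 4, 10                      # 4 samples, 10 classes
z = np.random.randn(m, N)         # raw logits z^{(i)}_j
y = np.array([3, 0, 7, 3])        # integer labels y^{(i)}, here in {0, ..., N-1}

# softmax: a_j = e^{z_j} / sum_k e^{z_k}
a = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)

# loss written with the explicit indicator: the inner sum visits every j,
# but 1{y^{(i)} == j} zeroes out every term except the target's
loss_indicator = 0.0
for i in range(m):
    for j in range(N):
        loss_indicator += (y[i] == j) * (-np.log(a[i, j]))
loss_indicator /= m

# equivalent: just pick the target column of each row
loss_pick = -np.log(a[np.arange(m), y]).mean()

print(loss_indicator, loss_pick)  # the two values match
```

Only one of the N log terms per sample survives the indicator, which is exactly what the paragraph you quoted means by “only the line that corresponds to the target contributes to the loss”.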

What do you think? What are the definitions of P(x) and Q(x)? How do you use them in this context? You may read this section of the article up to the Remarks.

Cheers,
Raymond

I am thinking that if we consider P(x) as the label “y” matrix (or, theoretically, it can be that indicator function as well), and then substitute the softmax function directly into the cost function (probably because of that `from_logits=True` parameter), then it should be similar to cross entropy.
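Writing out loosely what I mean, with P(j) as the one-hot indicator for the label of sample i and Q(j) as the softmax output (this is just my rough sketch of the substitution, not anything taken from the lab):

$$-\sum_{j=1}^{N} P(j)\log Q(j) = -\sum_{j=1}^{N} \mathbf{1}\{y^{(i)} == j\}\,\log \frac{e^{z^{(i)}_j}}{\sum_{k=1}^{N} e^{z^{(i)}_k}} = -\log \frac{e^{z^{(i)}_{y^{(i)}}}}{\sum_{k=1}^{N} e^{z^{(i)}_k}}$$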

I have two more doubts:

  1. How are you typing those notations?
    image
  2. I have seen some small negligible value being added to probability distributions so that calculating log(0) won't be a problem. Does this small value have anything to do with the round-off errors for which we use `from_logits=True`, or is it entirely because we are storing and using float values?
    image

Alright. You described some steps, but I am sure you can verify it yourself by actually writing down the steps on a piece of paper. Refer to that link if needed, since it is already a good example of how to argue and do the maths to go from the cross-entropy function to the loss for logistic regression. I will just leave that to you :wink:

For your first doubt: the notations are typed in LaTeX, e.g. `$y^{(i)}$`

See this for more examples
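For example, wrapped in dollar signs these render as the notations we have been using (the % comments are just my descriptions of each piece):

```latex
$y^{(i)}$                                  % superscript with parentheses
$\mathbf{1}\{y^{(i)} == n\}$               % the indicator function
$\frac{e^{z_j}}{\sum_{k=1}^{N} e^{z_k}}$   % one softmax term
```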

We know $0 \times \log(0) = 0$; however, a computer will complain or return the Not-a-Number token. To avoid that, we add a reasonably small number to lower-bound the argument of the log (e.g. `log(0. + 1e-10)`). Btw, why don't you just try `np.log(0.)` and `0. * np.log(0.)`?
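Here is the kind of quick check I mean, in plain numpy (with the warnings silenced so you can see the returned values):

```python
import numpy as np

with np.errstate(divide="ignore", invalid="ignore"):
    print(np.log(0.))        # -inf  (raises a divide-by-zero warning without errstate)
    print(0. * np.log(0.))   # nan   (0 * -inf is undefined in floating point)

# lower-bounding the argument keeps the log finite
eps = 1e-10                  # the small number added to the probability
print(np.log(0. + eps))      # roughly -23.03 instead of -inf
```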

Cheers,
Raymond
