The Indicator function in the softmax cost function

I am having trouble understanding the softmax cost function. I don't have much of a math background, so here is my understanding of it so far:
This is the specific loss function for each class.
Does the N summation go through all the classes to see which one is the true class?
Does the m summation add up all the probabilities to 1?
And then dividing by m gives the average cost?

Then what does the indicator function in the middle of the formula do?

Let’s go from the inside out.

This → image in the denominator makes the predictions over all classes sum to one for each sample. It is the normalization factor.
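To make the normalization concrete, here is a minimal sketch in plain Python. The scores are made up for illustration, and `softmax` is just a helper name I chose, not something from the course code. Whatever scores you feed in, dividing by the sum in the denominator makes the class probabilities of one sample add up to one.

```python
import math

def softmax(z):
    """Turn the raw scores z (one per class) of a single sample into probabilities."""
    # Subtracting the largest score does not change the result;
    # it only keeps exp() from overflowing on large inputs.
    shift = max(z)
    exps = [math.exp(v - shift) for v in z]
    denom = sum(exps)  # the normalization factor in the denominator
    return [e / denom for e in exps]

scores = [2.0, 1.0, 0.1]  # made-up scores for one sample with 3 classes
probs = softmax(scores)
total = sum(probs)        # adds up to 1 (up to floating-point rounding)
```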

This → image, as you know, is the indicator function, which does the picking for image. With this in mind, if you read a bit more about the indicator function, I believe you will work out what it does.

This → image gives you the picked loss value for example i according to image.

With a picked loss value for each sample i, image just sums those loss values over all samples.
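Putting the pieces together, here is a rough sketch of the whole cost written with explicit loops. Since the formula images are not visible here, I am assuming the usual form: an outer sum over the m examples, an inner sum over the N classes, the indicator multiplied by the log of the softmax probability, and a final factor of -1/m. The names `indicator` and `softmax_cost`, the 0-indexed class labels, and the numbers are all my own choices for illustration.

```python
import math

def indicator(condition):
    """Return 1 if the condition holds, otherwise 0."""
    return 1 if condition else 0

def softmax_cost(Z, y):
    """Z[i][j] is the score of class j for example i; y[i] is the true class (0..N-1)."""
    m = len(Z)     # number of examples
    N = len(Z[0])  # number of classes
    total = 0.0
    for i in range(m):                                    # outer sum over examples
        denom = sum(math.exp(Z[i][k]) for k in range(N))  # normalization factor
        for j in range(N):                                # inner sum over classes
            log_prob = math.log(math.exp(Z[i][j]) / denom)
            # The indicator zeroes out every class except the true one.
            total += indicator(y[i] == j) * log_prob
    return -total / m  # average over the m examples

Z = [[2.0, 1.0, 0.1],  # made-up scores: 2 examples, 3 classes
     [0.5, 2.5, 0.3]]
y = [0, 1]             # made-up true classes
cost = softmax_cost(Z, y)
```

The inner loop visits every class, but only the term for the true class survives, so the result is the same as looking up the log-probability of the true class directly for each example.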



Why do we need the N summation in front of the indicator function to calculate the loss value for example i? What is it adding up?

I understand that the m summation adds up the losses of all the classes, right?

Let’s look at this question first:

I suppose you have checked how the indicator function works. If not, please do so.

What does the indicator function image give you when the condition in the curly brackets is not satisfied?

Among the N things that this summation sign → image is summing up, how many of them can satisfy the condition in the curly brackets?
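If you want to check your answers to these two questions, here is a small sketch for a single example. The true class `y_i`, the number of classes, and the log-probabilities are made-up values, and I am assuming 0-indexed classes and each example having exactly one true class.

```python
def indicator(condition):
    """Return 1 if the condition holds, otherwise 0."""
    return 1 if condition else 0

N = 4                                 # hypothetical number of classes
y_i = 2                               # hypothetical true class of example i (0-indexed)
log_probs = [-1.9, -2.3, -0.4, -3.0]  # made-up log-probabilities, one per class

# One indicator value per class j: 1 where y_i == j, 0 everywhere else.
flags = [indicator(y_i == j) for j in range(N)]  # [0, 0, 1, 0]

# The sum over the N classes keeps only the term where the indicator is 1.
inner_sum = sum(flags[j] * log_probs[j] for j in range(N))
```

Here `inner_sum` ends up equal to `log_probs[y_i]`: the sum over classes is just a way of writing "pick the entry for the true class" in a formula.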