In this equation, I can’t understand why it takes the logarithm of only one specific class. For example, say our softmax output is [0.1, 0.7, 0.2] and the true class of this example is 1 (assuming indexing starts from 1). Why do we take log(0.1) and ignore 0.7 and 0.2 when calculating the loss, so that they aren’t included in the cost function either? I know I explained this badly; I will try to improve.

The loss is computed on each example separately, and the cost is then the sum (or, more commonly, the average) of those per-example loss values.
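To make the per-example idea concrete, here is a minimal sketch in plain Python. It assumes the loss in question is the usual categorical cross-entropy, and the names `example_loss` and `total_cost`, the two-example batch, and the 0-based target indices are all made up for illustration, not taken from the course material.

```python
import math

def example_loss(probs, target):
    """Loss for ONE example: -log of the probability given to the true class.

    `target` is a 0-based index here (an assumption of this sketch).
    """
    return -math.log(probs[target])

def total_cost(batch_probs, targets):
    """Sum of the per-example losses over the whole batch."""
    return sum(example_loss(p, t) for p, t in zip(batch_probs, targets))

# Hypothetical batch: two examples, three classes each.
batch_probs = [
    [0.1, 0.7, 0.2],  # true class gets only 0.1, so the loss is large
    [0.8, 0.1, 0.1],  # true class gets 0.8, so the loss is small
]
targets = [0, 0]  # both examples belong to the first class

losses = [example_loss(p, t) for p, t in zip(batch_probs, targets)]
cost = total_cost(batch_probs, targets)
print(losses, cost)
```

Dividing `cost` by the number of examples would give the averaged version of the cost instead of the summed one.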

Hi Kavalanche!

In the screenshot, it’s mentioned that only the line corresponding to the target class contributes to the loss. This means that when calculating the loss for an example, we use only the output value the model assigned to the correct category.

For example, suppose the model’s output is [0.1, 0.7, 0.2] for three different categories. These values are the predicted probabilities (the “likelihood”) of each category. If the target category is the first one, the output probability for that category is 0.1, so only that value enters the loss for this example: -log(0.1) ≈ 2.30, using the natural log.
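If it helps, here is a small sketch of why the other two values drop out. It assumes the equation is the standard cross-entropy, loss = -Σ y_j · log(a_j), with a one-hot label vector y. The variable names are just for illustration.

```python
import math

probs = [0.1, 0.7, 0.2]   # softmax output from the example above
one_hot = [1, 0, 0]       # one-hot label: the true class is the first one

# Full cross-entropy sum over all three classes.
# The terms for 0.7 and 0.2 are multiplied by 0, so they vanish.
full_sum = -sum(y * math.log(p) for y, p in zip(one_hot, probs))

# Shortcut: take -log of the true-class probability only.
single_term = -math.log(probs[0])

print(full_sum, single_term)
```

Both numbers come out the same, which is all the "only the target class contributes" remark is saying: the other terms are still in the formula, they are just multiplied by zero.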

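This approach ensures that the loss reflects how much probability the model assigns to the correct class. The other categories are not completely out of the picture, though: softmax outputs always sum to 1, so pushing the probability of the correct class up necessarily pushes the others down.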