Multiclass Lab: How is softmax implied in the loss function?

Multiclass Lab says:

This is done by the implied softmax function that is part of the loss function (SparseCategoricalCrossentropy). Unlike other activation functions, the softmax works across all the outputs.

Hi @Ankur_Agarwal1 great question!

In a multiclass classification problem, we typically use a softmax function as the final step of the neural network. Softmax takes the raw outputs of the model, known as logits, and transforms them into probabilities. Unlike other activations, it works across all of the outputs at once rather than on each one independently, which is why the results are non-negative and sum to 1.
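To make that concrete, here is a minimal sketch in TensorFlow; the logit values are made up for illustration:

```python
import tensorflow as tf

# Raw model outputs (logits) for one example with 4 classes.
logits = tf.constant([[2.0, 1.0, 0.1, -1.0]])

# Softmax works across all outputs at once: each logit is
# exponentiated, then divided by the sum of the exponentials,
# so the results are non-negative and sum to 1.
probs = tf.nn.softmax(logits)
print(probs.numpy())         # roughly [[0.64 0.23 0.10 0.03]]
print(tf.reduce_sum(probs))  # 1.0
```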

Now, we want to see how well our model is doing. That’s where the loss function comes in, and in this context we use `SparseCategoricalCrossentropy`. When it is constructed with `from_logits=True`, it does two things internally: it applies the softmax function to the logits, and then it computes the cross entropy. So when we talk about softmax being ‘implied’ in the loss function, it means the loss is already doing the softmax operation for us. We don’t need a softmax activation on the model’s output layer if we’re using the loss this way.
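A minimal sketch of this setup (the layer sizes here are just for illustration, not from the lab):

```python
import tensorflow as tf

# The "softmax implied in the loss" setup: the last layer is
# linear, so the model outputs raw logits.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(25, activation='relu'),
    tf.keras.layers.Dense(4)  # no activation: outputs are logits
])

# from_logits=True tells the loss to apply softmax internally
# before computing the cross entropy.
model.compile(
    optimizer='adam',
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
)
```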

Of course, if you want to include the softmax function in your model, you totally can: give the last layer a softmax activation and use the same `SparseCategoricalCrossentropy` loss with `from_logits=False` (the default), so the loss treats the model’s outputs as probabilities and doesn’t apply softmax a second time. (`CategoricalCrossentropy` is a separate choice: it expects one-hot encoded labels rather than integer labels, and it has the same `from_logits` option.)
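The same sketch with softmax moved inside the model:

```python
import tensorflow as tf

# The alternative setup: softmax lives inside the model, so the
# model's outputs are already probabilities.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(25, activation='relu'),
    tf.keras.layers.Dense(4, activation='softmax'),
])

# With from_logits=False (the default), the loss expects
# probabilities and does NOT apply softmax again.
model.compile(
    optimizer='adam',
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
)
```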

The choice between these two approaches depends on what you’re trying to achieve. If you want your model’s outputs to be probabilities directly, incorporating softmax in the model is convenient. However, if your goal is numerical stability, you’ll generally prefer to let the loss function handle the softmax: computing the softmax and the log of the cross entropy together avoids the rounding errors that can occur when a probability very close to 0 or 1 is produced first and its log is taken afterwards. Either way works; it’s just a matter of what fits your specific needs.
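One practical note on the `from_logits=True` setup: the model still outputs logits at prediction time, so you apply softmax yourself whenever you want probabilities. A quick sketch, where `x_new` is a hypothetical batch of inputs and `model` is the logits-output model from above:

```python
# The model outputs logits, so convert them to probabilities
# explicitly when you need them.
logits = model(x_new)
probs = tf.nn.softmax(logits)

# The most likely class for each example in the batch.
predicted_class = tf.argmax(probs, axis=-1)
```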

I hope this helps!