Preferred Implementation of Softmax

Since the output layer uses a linear activation rather than softmax, its outputs are not probabilities. So what does each output neuron produce now?

It’s a real value between -Infinity and +Infinity.

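If you set from_logits=True in the loss function, the loss converts those outputs to probabilities with softmax internally when it computes the loss. That happens behind the scenes, so you never see the probabilities, and the model’s own outputs stay as raw values.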
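To make the "behind the scenes" part concrete, here is a minimal sketch in plain Python. It is not the library's actual code; the function names and the sample values are made up for illustration. It just shows that a loss computed directly from the raw outputs (using the log-sum-exp trick) gives the same number as applying softmax first and then taking the cross-entropy.

```python
import math

def softmax(logits):
    # Subtract the max before exponentiating to avoid overflow.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy_from_probs(probs, true_class):
    # Two-step route: probabilities first, then -log(p of the true class).
    return -math.log(probs[true_class])

def cross_entropy_from_logits(logits, true_class):
    # One-step route (illustrative): -log(softmax(z)[k]) rewritten as
    # log(sum(exp(z))) - z[k], so no probabilities are ever materialized.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[true_class]

# Hypothetical raw outputs of a 3-class model, and a hypothetical true label.
logits = [2.0, -1.0, 0.5]
true_class = 0

loss_two_step = cross_entropy_from_probs(softmax(logits), true_class)
loss_one_step = cross_entropy_from_logits(logits, true_class)

print(loss_two_step, loss_one_step)  # the two values agree up to rounding
```

The one-step form is usually preferred because it avoids taking the log of a probability that has already been rounded to something very close to 0.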

Then we cannot make predictions using the model before converting the output to probabilities using SoftMax. Right?

Those raw outputs are called logits, and softmax preserves their order: a class with a larger logit always gets a larger probability. If your goal is to predict which class has the highest probability, you can make the same prediction by picking the class with the highest logit value. So, in this case, you don’t need to convert logits into probabilities. You only need softmax when you want the probability values themselves.
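A quick way to convince yourself, again as a plain-Python sketch with made-up logit values and hypothetical helper names: the index of the largest logit is the same as the index of the largest softmax probability.

```python
import math

def softmax(logits):
    # Subtract the max before exponentiating to avoid overflow.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def argmax(values):
    # Index of the largest element.
    return max(range(len(values)), key=lambda i: values[i])

# Hypothetical logits for one example with 4 classes.
logits = [-3.2, 4.1, 0.7, 1.9]
probs = softmax(logits)

pred_from_logits = argmax(logits)
pred_from_probs = argmax(probs)

print(pred_from_logits, pred_from_probs)  # both are 1
```

If you do want to show a confidence value next to the prediction, that is the point where you would apply softmax to the logits yourself.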
