Hi
Why do we use softmax? Why can't we just use something like z1/(z1+z2+z3)?
What is the purpose of the exponentials here?
Hi @Pirzada
To put it simply, softmax accounts for “confidence”: the larger the gaps between the input values, the more sharply the output concentrates on the largest one.
For example, with values that are close together:
softmax([1, 2, 3]) # not so confident
results in not so “confident” probabilities:
[0.09, 0.24, 0.67]
In contrast, z1/(z1+z2+z3) gives:
[0.17, 0.33, 0.5]
Now with values that are ten times larger:
softmax([10, 20, 30]) # very confident
result:
[0.00000, 0.00005, 0.99995]
In contrast, z1/(z1+z2+z3) gives exactly the same result as before, because the common factor of 10 cancels out:
[0.17, 0.33, 0.5]
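If you want to reproduce the numbers above, here is a minimal sketch using NumPy. The function names `softmax` and `linear_normalize` are just names I picked for illustration, and the last input `[-1, 1, 2]` is a made-up example (not from the posts above) to show one more thing the exponentials buy you: the outputs stay positive even when some scores are negative.

```python
import numpy as np

def softmax(z):
    """Exponentiate, then normalize so the outputs sum to 1."""
    z = np.asarray(z, dtype=float)
    # Subtracting the max does not change the result but avoids overflow.
    e = np.exp(z - z.max())
    return e / e.sum()

def linear_normalize(z):
    """The z1/(z1+z2+z3) alternative from the question."""
    z = np.asarray(z, dtype=float)
    return z / z.sum()

# The two inputs discussed above.
for scores in ([1, 2, 3], [10, 20, 30]):
    print(scores)
    print("  softmax:", np.round(softmax(scores), 5))
    print("  linear :", np.round(linear_normalize(scores), 2))

# Hypothetical input with a negative score: linear normalization
# returns a negative "probability", softmax does not.
mixed = [-1, 1, 2]
print("linear :", linear_normalize(mixed))
print("softmax:", np.round(softmax(mixed), 3))
```

Linear normalization only looks at ratios between the scores, so multiplying everything by 10 changes nothing. Softmax looks at differences between the scores, so the same scaling makes the largest score dominate.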
To add:
There are other ways to account for “confidence”, but softmax “works well” with cross-entropy. If you want to dive deeper, Chapter 6 of the free book Deep Learning by Ian Goodfellow, Yoshua Bengio, and Aaron Courville addresses this question.
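One way to see what “works well” can mean here, as a sketch under my own notation (z for the scores, p for the softmax output, y for a target that sums to 1, such as a one-hot label): the log in the cross-entropy undoes the exponential in softmax, and the gradient with respect to the scores comes out very simple.

```latex
p_i = \frac{e^{z_i}}{\sum_j e^{z_j}}, \qquad
L = -\sum_i y_i \log p_i
  = -\sum_i y_i z_i + \log \sum_j e^{z_j}
  \quad \text{(using } \textstyle\sum_i y_i = 1\text{)}

\frac{\partial L}{\partial z_k}
  = -y_k + \frac{e^{z_k}}{\sum_j e^{z_j}}
  = p_k - y_k
```

So the gradient is just the predicted probability minus the target, which stays informative even when the model is confidently wrong.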
Hi,
I found this article, I hope it helps: Softmax Activation Function with Python.