Softmax vs normal probability calculations

Pirzada · November 11, 2022, 11:45am

Hi

Why do we use softmax, and why can't we use something like z1/(z1+z2+z3)?
What is the purpose of the exponentials here?

arvyzukai · November 11, 2022, 2:21pm

In simple terms, softmax accounts for “confidence”: it responds to how far apart the input values are, not just to their ratios.

For example, with inputs that are close together:
softmax([1, 2, 3])
gives probabilities that are not very “confident”:
[0.09, 0.24, 0.67]

In contrast, z_i/(z1+z2+z3) gives:
[0.17, 0.33, 0.5]

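Now with inputs that are further apart:
softmax([10, 20, 30])
gives a very “confident” result:
[0.00000, 0.00005, 0.99995]

In contrast, z_i/(z1+z2+z3) gives exactly the same result as before, because multiplying every input by 10 does not change the ratios:
[0.17, 0.33, 0.5]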
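
The numbers above can be reproduced with a short sketch in plain Python. The function names `softmax` and `linear_normalize` are made up for this illustration, and subtracting the maximum before exponentiating is a common stability trick rather than something from the posts above.

```python
import math

def softmax(z):
    # Subtract the max before exponentiating to avoid overflow;
    # this shift cancels out and does not change the result.
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def linear_normalize(z):
    # The z_i / (z1 + z2 + z3) alternative from the question.
    # Only meaningful if all values are non-negative and the sum is nonzero.
    total = sum(z)
    return [v / total for v in z]

for z in ([1, 2, 3], [10, 20, 30]):
    print("z =", z)
    print("  softmax:", [round(p, 5) for p in softmax(z)])
    print("  linear: ", [round(p, 2) for p in linear_normalize(z)])
```

Beyond "confidence", the plain ratio also breaks down when the inputs can be negative: for [-1, 0, 1] the sum is zero, so `linear_normalize` raises a ZeroDivisionError, while `softmax` still returns valid probabilities because exponentials are always positive.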

To add:
There are other ways to account for “confidence”, but softmax “works well” with cross-entropy. If you want to dive deeper, Chapter 6 of the free Deep Learning book by Ian Goodfellow, Yoshua Bengio, and Aaron Courville addresses this question.
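
One way to see what “works well” could mean here is the standard gradient result below. It is not spelled out in the posts above, and the symbols are assumptions for this sketch: a_i is the softmax output for input z_i, and y is a one-hot target vector (so the y_i sum to 1).

```latex
a_i = \frac{e^{z_i}}{\sum_k e^{z_k}}, \qquad
L = -\sum_i y_i \log a_i
  = -\sum_i y_i z_i + \log \sum_k e^{z_k}
\quad\Longrightarrow\quad
\frac{\partial L}{\partial z_j} = a_j - y_j
```

The logarithm in the cross-entropy undoes the exponential in softmax, so the gradient with respect to each input is simply the predicted probability minus the target.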

Vu_Hoang_Ngo · November 13, 2022, 8:20am

Hi,
I found this article and hope it helps: Softmax Activation Function with Python.
