Week 2 softmax function

I understand that the softmax function is used to turn the final outputs into a probability for each class. But why don't we use a1 = z1/(z1+z2+z3+z4), which would also put the output between 0 and 1? Why do we use e to the power of z instead?

Hi @flyunicorn

Simply using $a_1 = \frac{z_1}{z_1 + z_2 + z_3 + z_4}$ only works if all z values are positive. If any $z_i$ is negative, an output can come out negative or greater than 1, and if the z values sum to zero the result is undefined. The softmax function uses $e^{z_i}$ to keep every term positive, emphasize larger scores exponentially, and keep each output between 0 and 1 while the outputs sum to 1.
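
To make the difference concrete, here is a small sketch comparing the two approaches. The function names `naive_normalize` and `softmax` and the example scores are made up for illustration, and subtracting the maximum score before exponentiating is a common numerical-stability trick I've added; it does not change the softmax result.

```python
import math

def naive_normalize(z):
    """Divide each score by the sum of all scores: a_i = z_i / sum(z)."""
    total = sum(z)
    return [z_i / total for z_i in z]

def softmax(z):
    """Exponentiate each score, then divide by the sum of the exponentials."""
    m = max(z)  # shift by the max for numerical stability (assumed extra step)
    exps = [math.exp(z_i - m) for z_i in z]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores with mixed signs; they happen to sum to 1.0.
z = [2.0, -1.0, 0.5, -0.5]

# Naive normalization leaves values outside [0, 1], so they are not probabilities.
print(naive_normalize(z))

# Softmax outputs are all strictly between 0 and 1 and sum to 1.
probs = softmax(z)
print(probs, sum(probs))
```

With these scores the naive version returns values such as 2.0 and -1.0, which cannot be probabilities, and an input like `[1.0, -1.0]` would raise a `ZeroDivisionError` because the scores sum to zero. The softmax version stays valid for any real-valued scores.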

Hope it helps! Feel free to ask if you need further assistance.
