Softmax layer intuition

We know that softmax is usually applied to multi-class classification problems, with the formula $\frac{e^{a_i}}{\sum_j e^{a_j}}$.

My question is: would a function like $\frac{a_i^{2}}{\sum_j a_j^{2}}$ work in most cases as well? If not, why?

(Here $a_i$ stands for the $i$-th output of the last activation.)

Hi @LimXiuXian96,

Both softmax and the function you propose are normalizing functions: each maps the input array to values between 0 and 1 that sum to 1. Softmax is the usual choice for multi-class classification, while alternative functions are still a subject of research. One needs to test both functions on the data available.
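To make the comparison concrete, here is a minimal sketch of both functions in plain Python. The names `softmax` and `square_normalize` and the sample input are my own choices for illustration, not something from a library. It checks that both outputs sum to 1, and it also shows one difference that does not depend on the data: squaring discards the sign of the input, so `2.0` and `-2.0` get the same probability, whereas softmax keeps their order.

```python
import math

def softmax(a):
    """e^{a_i} / sum_j e^{a_j}, shifted by max(a) for numerical stability."""
    m = max(a)
    exps = [math.exp(x - m) for x in a]
    total = sum(exps)
    return [e / total for e in exps]

def square_normalize(a):
    """a_i^2 / sum_j a_j^2 (hypothetical name for the proposed alternative)."""
    squares = [x * x for x in a]
    total = sum(squares)
    if total == 0:
        # Unlike softmax, this function is undefined when every input is zero.
        raise ValueError("all inputs are zero; a^2 / sum(a^2) is undefined")
    return [s / total for s in squares]

# Illustrative input with a positive and a negative value of equal magnitude.
a = [2.0, -2.0, 0.5]
p_soft = softmax(a)
p_sq = square_normalize(a)

print("softmax:", p_soft)  # 2.0 gets more mass than -2.0
print("squared:", p_sq)    # 2.0 and -2.0 both get 4 / 8.25
```

Whether the sign issue matters depends on what the last layer can output. If the activations are guaranteed non-negative it disappears, but the all-zero case still needs handling.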

In summary, there isn’t a definitive answer to this question. Below are the results of softmax (left) and the other function (right) for 50 random integers between 0 and 10. What is striking is that the output of the other function is noisier.
[Two images: softmax output (left), output of the other function (right)]
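If you want to try this kind of experiment yourself, here is a rough sketch of the setup: 50 random integers between 0 and 10, passed through both functions. The seed, the helper names, and the printed summary are my assumptions, so this will not reproduce the exact plots, only the same type of comparison.

```python
import math
import random

def softmax(a):
    m = max(a)  # shift by the maximum for numerical stability
    exps = [math.exp(x - m) for x in a]
    total = sum(exps)
    return [e / total for e in exps]

def square_normalize(a):
    squares = [x * x for x in a]
    total = sum(squares)
    if total == 0:
        raise ValueError("all inputs are zero; a^2 / sum(a^2) is undefined")
    return [s / total for s in squares]

rng = random.Random(0)  # arbitrary fixed seed so the run is repeatable
data = [rng.randint(0, 10) for _ in range(50)]  # randint includes both ends

soft = softmax(data)
sq = square_normalize(data)

# Equal inputs give equal outputs, so one row per distinct input value is enough.
for value, p_soft, p_sq in sorted(set(zip(data, soft, sq))):
    print(f"a={value:2d}  softmax={p_soft:.5f}  squared={p_sq:.5f}")
```

One thing to look for in the printout: going from a=9 to a=10 multiplies the softmax output by e (about 2.72), but the squared version only by 100/81 (about 1.23). So softmax concentrates the mass on the largest inputs, while the squared version spreads it over many more of them.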
