Confusion about the mathematical formula for a_1, a_2, a_3, a_4 in softmax regression

When Professor Andrew Ng introduces softmax regression, he uses an example with only 4 possible outputs.

We have a_1 = P(y=1|\vec{x}) = \frac{e^{z_1}}{e^{z_1}+e^{z_2}+e^{z_3}+e^{z_4}}, and a_2, a_3, a_4 are defined similarly.

But I failed to reproduce this formula by computing \frac{g(z_1)}{g(z_1)+g(z_2)+g(z_3)+g(z_4)}, where g stands for the sigmoid function. How should it be computed? I just want to verify the equation.
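As a quick numerical sanity check of the softmax formula in the question, here is a minimal sketch in Python (the logit values z_1 through z_4 are made up for illustration):

```python
import math

# Hypothetical logits z1..z4 for a 4-class softmax (example values only).
z = [1.0, 2.0, 0.5, -1.0]

# Softmax: a_k = e^{z_k} / (e^{z_1} + e^{z_2} + e^{z_3} + e^{z_4})
exps = [math.exp(v) for v in z]
total = sum(exps)
a = [e / total for e in exps]

# By construction the four probabilities sum to 1.
print(a, sum(a))
```

Note that this normalizes the raw exponentials e^{z_k}, not sigmoid(z_k) values, which is exactly where the confusion in the question comes from.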

Hello @nate4,

Check this question out.


Thank you, rmwkwok!

I learned that softmax and the sigmoid function are two different functions; simply plugging z into sigmoid(z) and softmax(z) and trying to equate the two would never work.

But they are related. For example, in binary classification we have:

softmax(x_1) = \frac{e^{x_1}}{e^{x_1}+e^{x_2}} = \frac{1}{1+e^{x_2-x_1}} = sigmoid(x_1-x_2)
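The identity above can be checked numerically. A minimal sketch, with arbitrary example logits x_1 and x_2:

```python
import math

def sigmoid(t):
    """Logistic sigmoid: 1 / (1 + e^{-t})."""
    return 1.0 / (1.0 + math.exp(-t))

def softmax2(x1, x2):
    """First component of a two-class softmax: e^{x1} / (e^{x1} + e^{x2})."""
    e1, e2 = math.exp(x1), math.exp(x2)
    return e1 / (e1 + e2)

# For any pair of logits, softmax(x_1) equals sigmoid(x_1 - x_2)
# up to floating-point error (example values only).
x1, x2 = 0.7, -0.3
print(softmax2(x1, x2), sigmoid(x1 - x2))
```

This shows why, with two classes, softmax is redundant: only the difference x_1 - x_2 matters, which is what plain logistic regression models with a single logit.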

You don’t use softmax for binary classification.

It’s typically used only with multiple classes, so that all of the probabilities sum to 1.

Agreed!! :raised_hands: