Confusion about the mathematical formula for a_1, a_2, a_3, a_4 in softmax regression

When Professor Andrew Ng introduces softmax regression, he uses an example with only 4 possible outputs.

We have a_1 = P(y=1|\vec{x}) = \frac{e^{z_1}}{e^{z_1}+e^{z_2}+e^{z_3}+e^{z_4}}, and a_2, a_3, a_4 are defined similarly.

But I failed to reproduce this formula by computing \frac{g(z_1)}{g(z_1)+g(z_2)+g(z_3)+g(z_4)}, where g stands for the sigmoid function. How should it be computed? I just want to verify the equation.
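As a quick numerical sanity check of the softmax formula in the question, here is a minimal sketch in Python (the logit values z_1 through z_4 are made up for illustration):

```python
import math

# Hypothetical logits z1..z4 for a 4-class softmax (example values only).
z = [1.0, 2.0, 0.5, -1.0]

# Softmax: a_k = e^{z_k} / (e^{z_1} + e^{z_2} + e^{z_3} + e^{z_4})
exps = [math.exp(v) for v in z]
total = sum(exps)
a = [e / total for e in exps]

# By construction the four probabilities sum to 1.
print(a, sum(a))
```

Note that this normalizes the raw exponentials e^{z_k}, not sigmoid(z_k) values, which is exactly where the confusion in the question comes from.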

Hello @nate4,

Check this question out.


Thank you, rmwkwok!

I learned that softmax and the sigmoid function are two different functions; simply plugging z into sigmoid(z) and softmax(z) and trying to equate the two would never work.

But they are related. For example, in binary classification we have:

softmax(x_1) = \frac{e^{x_1}}{e^{x_1}+e^{x_2}} = \frac{1}{1+e^{x_2-x_1}} = sigmoid(x_1-x_2)
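The identity above can be checked numerically. A minimal sketch, with arbitrary example logits x_1 and x_2:

```python
import math

def sigmoid(t):
    """Logistic sigmoid: 1 / (1 + e^{-t})."""
    return 1.0 / (1.0 + math.exp(-t))

def softmax2(x1, x2):
    """First component of a two-class softmax: e^{x1} / (e^{x1} + e^{x2})."""
    e1, e2 = math.exp(x1), math.exp(x2)
    return e1 / (e1 + e2)

# For any pair of logits, softmax(x_1) equals sigmoid(x_1 - x_2)
# up to floating-point error (example values only).
x1, x2 = 0.7, -0.3
print(softmax2(x1, x2), sigmoid(x1 - x2))
```

This shows why, with two classes, softmax is redundant: only the difference x_1 - x_2 matters, which is what plain logistic regression models with a single logit.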

You don’t use softmax for binary classification.

It’s typically used only with multiple classes, so that all of the probabilities sum to 1.

Agreed!! :raised_hands: