Improved implementation of softmax - Neural network training | Coursera

Hi, my question is: why did we change the output layer's activation function to linear? We have a multiclass classification problem with 10 possible outputs, so how are we going to predict the output label y? Also, what does “from_logits=True” do, and what does it mean?


If you search the forum a bit, you'll find that one of our mentors, @rmwkwok, has written a great post about this. Check it out:
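
Short version, as my own sketch rather than a quote from that post: with a linear output layer the network outputs raw scores (logits) instead of probabilities, and `from_logits=True` tells the loss to apply softmax internally, fused with the log, which is numerically more stable. The plain-Python example below only illustrates that idea; it is not how Keras implements it, and the function names (`softmax`, `loss_from_probs`, `loss_from_logits`, `predict`) and the example numbers are made up for illustration.

```python
import math

def softmax(logits):
    """Turn raw linear-layer outputs (logits) into probabilities."""
    m = max(logits)  # subtract the max so exp() cannot overflow
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def loss_from_probs(probs, y):
    """Two-step loss: softmax already applied, then -log(p[y])."""
    return -math.log(probs[y])

def loss_from_logits(logits, y):
    """Fused loss, the idea behind from_logits=True.

    -log(softmax(z)[y]) rearranged as logsumexp(z) - z[y],
    so no tiny probability is ever passed to log().
    """
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[y]

def predict(logits):
    """Predicted label = index of the largest logit.

    Softmax keeps the ordering, so argmax over logits and
    argmax over probabilities give the same class.
    """
    return max(range(len(logits)), key=lambda i: logits[i])

# Hypothetical logits for a 10-class problem.
logits = [2.0, -1.0, 0.5, 3.0, 0.0, -2.0, 1.0, 0.3, -0.5, 1.5]
probs = softmax(logits)

print("predicted label:", predict(logits))
print("two-step loss:", loss_from_probs(probs, 3))
print("fused loss:   ", loss_from_logits(logits, 3))

# Extreme logits: the true class (0) gets a probability that underflows to 0.
extreme = [0.0] * 9 + [1000.0]
try:
    loss_from_probs(softmax(extreme), 0)
except ValueError:
    print("two-step loss failed: log(0)")
print("fused loss on extreme logits:", loss_from_logits(extreme, 0))
```

So to predict y you take the index of the largest logit, and if you want actual probabilities you can apply a softmax to the model output yourself after training (in TensorFlow, `tf.nn.softmax`).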