Course 3 Week 2 quiz: Why not use softmax?

I don’t understand the answer. Didn’t the previous course use softmax as the output-layer activation for multi-task classification?
