Your goal is to detect road signs (stop sign, pedestrian crossing sign, construction ahead sign) and traffic signals (red and green lights) in images. The goal is to recognize which of these objects appear in each image. You plan to use a deep neural network with ReLU units in the hidden layers. For the output layer, which of the following gives you the most appropriate activation function?
Out of the given options, I chose “SoftMax”, but that was incorrect. I am not able to understand the reason.
You tell me: since you need to predict the presence or absence of multiple objects in each image, which activation function lets each output neuron independently represent the probability that its corresponding object is present in the image?
To me, Sigmoid can be used. The sigmoid activation function squashes the output of each neuron into the range (0, 1), so each neuron in the output layer can independently represent the probability that its corresponding object is present in the image.
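To make that concrete, here is a minimal sketch of a sigmoid output layer for this problem in plain Python. The label order, the example logits, and the 0.5 threshold are all assumptions made up for illustration; they are not part of the original question.

```python
import math

# Hypothetical ordering of the five output neurons (an assumption for this sketch).
LABELS = [
    "stop sign",
    "pedestrian crossing",
    "construction ahead",
    "red light",
    "green light",
]

def sigmoid(z):
    """Numerically stable logistic function; the result lies strictly in (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def detect(logits, threshold=0.5):
    """Return every label whose independent probability reaches the threshold."""
    probs = [sigmoid(z) for z in logits]
    return [label for label, p in zip(LABELS, probs) if p >= threshold]

# Made-up pre-activation values for one image.
example_logits = [3.0, -2.0, -4.0, 2.5, -3.0]
print(detect(example_logits))  # ['stop sign', 'red light']
```

Each output is thresholded on its own, so an image can yield several labels, one label, or none at all.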
Softmax only rescales the outputs so that they sum to 1, which forces the classes to compete with one another. But since you want to identify multiple items in each image, the classes are independent, and the sum of their probabilities won't necessarily be 1.
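A small comparison may help here. This is only a sketch in plain Python: the five logits are invented values standing for an image in which two objects (say, a stop sign and a red light) are both clearly present.

```python
import math

def softmax(logits):
    """Softmax with the usual max-subtraction for numerical stability."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

# Invented logits: outputs 0 and 3 are both strongly "present".
logits = [3.0, -2.0, -4.0, 3.0, -3.0]

soft = softmax(logits)
sig = [sigmoid(z) for z in logits]

# Softmax outputs always sum to 1, so the two present objects must share
# the probability mass: each of them ends up just under 0.5.
print([round(p, 3) for p in soft], round(sum(soft), 3))

# Sigmoid outputs are independent: both present objects score above 0.9,
# and the total is well above 1.
print([round(p, 3) for p in sig], round(sum(sig), 3))
```

With softmax, neither object could pass a 0.5 threshold even though both are present; with sigmoid, each one is scored on its own.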