Why is sigmoid used instead of softmax?

I noticed that sigmoid is used in the last activation layer of the assignment to classify the output. However, in C3_W3_Lab_2_OxfordPets-UNet, softmax is used. Why is that? Both the lab and the assignment do the same segmentation task, so why aren't we using softmax in C3_W3_Assignment?


I'm not sure, but I think softmax is the more general solution, and the only option if you have more than two classes. But if the segmentation problem has only two classes, it is also possible to use a sigmoid activation.
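To see why the two are interchangeable in the two-class case, here is a small stdlib-only sketch (the logit values are made up for illustration): a sigmoid over the difference of two logits gives the same class-1 probability as a two-way softmax over those logits.

```python
import math

def sigmoid(z):
    # Standard logistic function.
    return 1.0 / (1.0 + math.exp(-z))

def softmax(zs):
    # Numerically naive softmax; fine for small example values.
    exps = [math.exp(z) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

# With two classes, softmax over logits [z0, z1] assigns class 1 the same
# probability as sigmoid applied to the logit difference z1 - z0.
z0, z1 = 0.3, 1.7
p_softmax = softmax([z0, z1])[1]
p_sigmoid = sigmoid(z1 - z0)
print(p_softmax, p_sigmoid)  # both approximately 0.802
```

So for binary segmentation a single sigmoid channel carries exactly the same information as a two-channel softmax; the choice is a matter of convention (and of which loss function you pair it with).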

Sometimes it is just personal preference; for me, it is better to always use a softmax in a segmentation decoder.

I encourage you to try replacing the last layer in the assignment 🙂 and if you do, please share the results with us!

Regards.


I agree with @Pere_Martra, and superficially, with only 2 classes there shouldn't be any difference between sigmoid and softmax. But one subtle point is that these activations are meant to be paired with different loss functions: sparse_categorical_crossentropy for softmax vs. binary_crossentropy for sigmoid. I am guessing this might play a role here, but I would have to dig deeper into these assignments before confirming it.
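The loss pairing lines up the same way the activations do. A stdlib-only sketch (the logit value and label are made up for illustration): binary cross-entropy on a sigmoid output equals sparse categorical cross-entropy on the equivalent two-way softmax, because both reduce to -log of the probability assigned to the true class.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def softmax(zs):
    exps = [math.exp(z) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

def binary_crossentropy(y_true, p):
    # y_true is 0 or 1; p is the sigmoid probability of class 1.
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))

def sparse_categorical_crossentropy(y_true, probs):
    # y_true is an integer class index; probs is a softmax distribution.
    return -math.log(probs[y_true])

# A single logit z for a sigmoid head corresponds to logits [0, z] for a
# two-way softmax head, since sigmoid(z) == softmax([0, z])[1].
z = 0.9
y = 1
bce = binary_crossentropy(y, sigmoid(z))
scce = sparse_categorical_crossentropy(y, softmax([0.0, z]))
```

Here `bce` and `scce` come out identical, which is why the sigmoid + binary_crossentropy pairing in the assignment and the softmax + sparse_categorical_crossentropy pairing in the lab should train toward the same solution in the two-class case.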