[Week 2] Assignment 2, Exercise 2: Why should we choose a 'linear' output instead of a sigmoid output if it's a binary classification problem and not linear regression?

In the function alpaca_model, why should we choose a 'linear' output instead of a sigmoid output, given that it's a binary classification problem and not linear regression?

This is the output of the autograder block:
['InputLayer', [(None, 160, 160, 3)], 0]
['Sequential', (None, 160, 160, 3), 0]
['TensorFlowOpLayer', [(None, 160, 160, 3)], 0]
['TensorFlowOpLayer', [(None, 160, 160, 3)], 0]
['Functional', (None, 5, 5, 1280), 2257984]
['GlobalAveragePooling2D', (None, 1280), 0]
['Dropout', (None, 1280), 0, 0.2]
['Dense', (None, 1), 1281, 'linear']

and the last layer (['Dense', (None, 1), 1281, 'linear']) gives an error if I choose a sigmoid activation function.
Could anybody explain why?
Thanks in advance!

That is because they are using the BinaryCrossentropy loss function with the from_logits=True argument, which tells the loss to apply the sigmoid internally. The output layer therefore has to produce raw logits, i.e. use a linear activation; with a sigmoid in the layer as well, the sigmoid would be applied twice. You could have separated the two (sigmoid in the output layer and from_logits=False in the loss), but they say that you get better numerical stability if you use the "bundled" implementation. E.g. it makes it easier for them to handle issues with saturation of the sigmoid values.
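To make the stability point concrete, here is a minimal standard-library sketch. It is not the TF source code: the function names are made up for illustration, and the "bundled" version uses the well-known stable form max(z, 0) - z*y + log(1 + exp(-|z|)) of the sigmoid cross-entropy. As far as I know, Keras clips probabilities instead of failing when from_logits=False, so the crash below only illustrates what saturation does to the naive computation.

```python
import math


def bce_from_probability(y, z):
    """Naive version: apply the sigmoid first, then take logs of the result."""
    p = 1.0 / (1.0 + math.exp(-z))
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


def bce_from_logit(y, z):
    """Bundled version: work on the logit z directly, never forming log(0)."""
    return max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))


# For a moderate logit both versions agree.
print(bce_from_probability(1, 2.0), bce_from_logit(1, 2.0))

# For a saturated logit the sigmoid rounds to exactly 1.0 in float64,
# so the naive version ends up evaluating log(0).
try:
    print(bce_from_probability(0, 40.0))
except ValueError as err:
    print("naive version failed:", err)

# The bundled version still returns a finite loss.
print(bce_from_logit(0, 40.0))
```

The two functions compute the same quantity mathematically; only the second one stays well behaved once the sigmoid saturates.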

Have a look at the TF doc page for BinaryCrossentropy for more info.
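If you want to check the equivalence yourself, here is a small sketch, assuming TensorFlow 2.x is installed. The logits and labels are made-up values, not data from the assignment. It compares the loss on raw logits with from_logits=True against the loss on sigmoid outputs with from_logits=False, and shows that with a linear output layer you apply the sigmoid yourself when you want probabilities at prediction time.

```python
import tensorflow as tf

# Made-up raw outputs of a linear Dense(1) layer and their true labels.
logits = tf.constant([[-3.0], [0.5], [4.0]])
labels = tf.constant([[0.0], [1.0], [1.0]])

# "Bundled" setup: linear output layer, sigmoid handled inside the loss.
loss_on_logits = tf.keras.losses.BinaryCrossentropy(from_logits=True)
loss_logits_value = float(loss_on_logits(labels, logits))

# Separated setup: sigmoid applied first, loss works on probabilities.
loss_on_probs = tf.keras.losses.BinaryCrossentropy(from_logits=False)
loss_probs_value = float(loss_on_probs(labels, tf.sigmoid(logits)))

print(loss_logits_value, loss_probs_value)

# With a linear output layer, turn logits into class predictions yourself.
probabilities = tf.sigmoid(logits)
predictions = (probabilities > 0.5).numpy().astype(int).ravel().tolist()
print(predictions)
```

For well-behaved logits like these the two losses match up to floating-point error, so the choice between the setups is about stability, not about what is being optimized.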
