Hi, could anyone explain why the output layer is activated with the ‘linear’ function instead of the ‘sigmoid’ function?
Because the sigmoid activation is performed as part of the loss calculation. That’s what the from_logits = True argument tells the cost function. Doing it that way is more numerically stable and more efficient (one less call).
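To make the stability point concrete, here is a minimal sketch in plain Python (no TensorFlow needed). The function names `bce_naive` and `bce_from_logits` are made up for illustration and are not Keras APIs. The second one uses the standard rearrangement of binary cross-entropy in terms of the logit, which is the kind of computation that from_logits = True makes possible.

```python
import math

def bce_naive(z, y):
    """Apply sigmoid first, then compute cross-entropy on the probability."""
    p = 1.0 / (1.0 + math.exp(-z))
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def bce_from_logits(z, y):
    """Compute the same loss directly from the logit z, never forming log(0)."""
    return max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))

# (logit, label) pairs chosen for illustration only.
for z, y in [(2.0, 1), (40.0, 0)]:
    try:
        naive = bce_naive(z, y)
    except ValueError:
        # For z = 40 the sigmoid rounds to exactly 1.0, so log(1 - p) is log(0).
        naive = None
    print(z, y, naive, bce_from_logits(z, y))
```

For moderate logits the two versions agree. For a large logit with the opposite label, the naive version breaks down because the probability has already been rounded, while the logit version still returns a finite loss.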
But we expect the output to be a binary class and not a logit, since it’s image classification?
The method shown with the loss function is what you do when you are training the network. Either sigmoid or softmax is applied as part of the loss in training mode. What we are using here is the Keras Model class, which has lots of methods to support the various things you need to do. Model.fit() does the training, and Model.predict() runs the forward pass and returns the output of the final layer. Note that with a linear output layer, that output is the logits, so to get the actual predictions you apply sigmoid (or softmax) to the result yourself. Here’s one of the relevant doc pages.
I tried to generate a prediction for some sample data. However, with the .predict() method the results were logits and not classes. What needs to be passed to .predict() as arguments to get classes?
I guess that after calling .predict(), to get the predictions as classes you have to apply the sigmoid function to the logits and then a threshold. I hope this is the way to go about it.
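A minimal sketch of that post-processing step, in plain Python so it runs without TensorFlow. The logit values are made up and stand in for whatever .predict() returns from a linear output layer; `logits_to_classes` and the 0.5 threshold are illustrative assumptions, not part of Keras. In a real TensorFlow workflow you would more likely apply something like tf.math.sigmoid to the predict output.

```python
import math

def sigmoid(z):
    """Numerically safe sigmoid for a single float."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def logits_to_classes(logits, threshold=0.5):
    """Turn raw logits into 0/1 class labels via sigmoid plus a threshold."""
    return [1 if sigmoid(z) >= threshold else 0 for z in logits]

# Hypothetical logits, as if returned by model.predict(x) and flattened.
logits = [-3.2, 0.4, 5.1, -0.1]
probabilities = [sigmoid(z) for z in logits]
classes = logits_to_classes(logits)
print(probabilities)
print(classes)  # [0, 1, 1, 0]
```

With a threshold of 0.5 this is the same as checking whether the logit is at least 0, since sigmoid(0) = 0.5. The explicit sigmoid is still useful when you want the probabilities themselves or a different threshold.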