Activation Function for Last Layer - Lab Assignment: Neural Networks for Binary Classification


For this week’s lab assignment, exercise 1, when building the network described with a Keras Sequential model and Dense layers, why is it necessary to specify that the last layer, Dense(1, activation='sigmoid'), has a sigmoid activation function?

In the Keras Sequential model documentation there are examples very similar to this, but the last layer doesn’t have an activation function specified. **How can we choose whether (or not) to specify an activation function for a specific layer?** Please let me know.


The key is in how you invoke the loss function. You have two choices: you can explicitly include sigmoid or softmax as the output-layer activation (depending on whether it’s binary or multiclass classification), or you can omit the output activation and pass the `from_logits=True` argument to tell the loss function to do the activation computation together with the loss internally. The two methods are logically equivalent, but the latter is preferable: there is less code to write, and it is more numerically stable, so it gives more accurate results. Here’s a thread which discusses that and explains more about it.
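To see why the `from_logits=True` route is more numerically stable, here is a small pure-Python sketch of the arithmetic involved (this is illustrative only, not the actual Keras/TensorFlow implementation). Computing sigmoid first and then taking the log can underflow for large logits, while the fused form never takes the log of zero:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def bce_naive(z, y):
    # "activation, then loss": take sigmoid first, then cross-entropy
    p = sigmoid(z)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def bce_from_logits(z, y):
    # fused, stable form: max(z, 0) - y*z + log(1 + exp(-|z|))
    # algebraically the same cross-entropy, but never takes log of 0
    return max(z, 0.0) - y * z + math.log1p(math.exp(-abs(z)))

# For moderate logits the two agree:
print(bce_naive(2.0, 1.0))        # ~0.1269
print(bce_from_logits(2.0, 1.0))  # ~0.1269

# For a large logit, sigmoid(40) rounds to exactly 1.0 in float64,
# so the naive version tries log(1 - 1.0) = log(0) and fails:
try:
    bce_naive(40.0, 0.0)
except ValueError:
    print("naive version failed")
print(bce_from_logits(40.0, 0.0))  # ~40.0, the correct loss
```

This is the same trick TensorFlow applies internally when you pass `from_logits=True` to the loss: the sigmoid and the log are combined algebraically before anything is evaluated in floating point.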

Mind you, I am not a mentor for this particular course, so I don’t know if the assignment here has any requirements for which way you implement it in this particular case. You’ll need to consult the instructions.


Hi @Sreeyutha_Ratala

Following @paulinpaloalto's detailed explanation, I'd just like to add that the requirement for this exercise is to have 3 layers with sigmoid as the activation function. Please see below:

The neural network you will use in this assignment is shown in the figure below.

  • This has three dense layers with sigmoid activations.
    • Recall that our inputs are pixel values of digit images.
    • Since the images are of size 20×20, this gives us 400 inputs.
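As a quick shape check, here is a plain-NumPy forward pass through a network of that shape. Note the hidden-layer sizes (25 and 15) are hypothetical placeholders; this thread only fixes the input size (400) and the output size (1), so consult the assignment figure for the real sizes:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

# 400 inputs (20x20 pixel images, flattened); the hidden sizes 25 and 15
# are placeholders -- only the 400 inputs and 1 output come from the thread.
sizes = [400, 25, 15, 1]
weights = [rng.standard_normal((m, n)) * 0.01
           for m, n in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]

x = rng.standard_normal((5, 400))  # a batch of 5 flattened images
a = x
for W, b in zip(weights, biases):
    a = sigmoid(a @ W + b)  # sigmoid on every layer, including the last

print(a.shape)  # (5, 1): one probability in (0, 1) per image
```

Because the last layer applies sigmoid explicitly, each output is already a probability, so the matching Keras loss would be BinaryCrossentropy with its default `from_logits=False`.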