The code comments say to use a binary classification activation function (Sigmoid), but the instructions below the code specify a ‘Linear’ activation function.
Hi @Mushi, maybe you have an outdated version of the assignment. This is what I see on GitHub:
There should be an option in the top right corner to get the latest version!
Thanks. Nevertheless, I completed the assignment. I will be more careful next time.
Gent is right that the current version of the assignment doesn’t have that misleading comment. Are you taking the course through your university or some other source besides directly from Coursera? Why would you have the older version?
The larger point is that we always use the from_logits = True mode for efficiency. That means the sigmoid (or softmax) is applied in the loss calculation. Here’s a thread which discusses that. And here’s a thread from mentor Raymond that shows why that approach is better.
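To make that concrete, here is a rough pure-Python sketch of what folding the sigmoid into the loss buys you. It is not TensorFlow's actual code; the function names are made up for illustration, and the logit-based formula is the usual stable rearrangement of binary cross-entropy. In Keras, the equivalent setup would be a linear output layer paired with `tf.keras.losses.BinaryCrossentropy(from_logits=True)`.

```python
import math

def sigmoid(z):
    # Plain sigmoid, as an output layer with activation="sigmoid" would compute it.
    return 1.0 / (1.0 + math.exp(-z))

def bce_from_probability(y, p):
    # Illustrative "from_logits=False" path: the loss only sees the probability p.
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def bce_from_logit(y, z):
    # Illustrative "from_logits=True" path: the loss sees the raw linear output z
    # and applies the sigmoid internally in an algebraically simplified form:
    #   max(z, 0) - z*y + log(1 + exp(-|z|))
    return max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))

# For a moderate logit the two paths agree to rounding error.
moderate_gap = abs(bce_from_probability(1, sigmoid(2.0)) - bce_from_logit(1, 2.0))

# For a large logit with the "wrong" label, sigmoid(40.0) rounds to exactly 1.0,
# so the probability path ends up evaluating log(0) and fails,
# while the logit path still returns a finite loss.
stable_loss = bce_from_logit(0, 40.0)
```

The two-step version loses information when the sigmoid output is rounded to 1.0, and the combined version never forms that intermediate value. That is the reason for keeping the output layer linear and letting the loss apply the sigmoid.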