Hi @Hussam,
You are right, we should use categorical_cross_entropy. At the time of my previous comment the notebook specified binary_cross_entropy; that was later revised to what it is now, which makes more sense.
If you are interested, you can check some discussions about it:
That’s because they want you to use the Binary_crossentropy loss function with the from_logits = True argument, which causes the sigmoid calculation to be incorporated in the loss computation. That is preferred because numerical stability can be managed better when the two are done together, e.g. when dealing with problems with “saturated” sigmoid output values. Here’s the doc page for Binary_crossentropy.
They don’t really explain what the from_logits = True argument does in the assignment, but the…
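To make the numerical-stability point from that thread concrete, here is a small sketch (assuming TensorFlow 2.x and tf.keras; the tensor values are just illustrative, not from the assignment):

```python
import tensorflow as tf

y_true = tf.constant([[1.0]])          # positive label
logits = tf.constant([[-200.0]])       # model is confidently (and wrongly) negative

# Preferred: hand the raw logits to the loss and let it fold the sigmoid
# into the log, which stays accurate even for extreme logits.
bce_logits = tf.keras.losses.BinaryCrossentropy(from_logits=True)
print(bce_logits(y_true, logits).numpy())    # ~200, the true cross entropy

# Naive alternative: apply the sigmoid yourself. sigmoid(-200) underflows to 0
# in float32, so the loss has to clip the probability and the result is
# badly understated compared to the true loss.
probs = tf.sigmoid(logits)
bce_probs = tf.keras.losses.BinaryCrossentropy(from_logits=False)
print(bce_probs(y_true, probs).numpy())
```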
Update: Please note that after this thread was created, the Course Staff made some significant updates to this assignment, which include switching to using CategoricalCrossEntropy for the loss function in this section.
Because we specify the from_logits = True argument, the loss logic will apply either sigmoid or softmax to the logits to compute the actual \hat{y} values, and then compute the cross entropy loss between the predictions and the labels:
-y_true * log(sigmoid(y…
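If it helps, here is a minimal sketch of that behaviour for the categorical case (again assuming TensorFlow 2.x; the tensors are made up for illustration):

```python
import tensorflow as tf

y_true = tf.constant([[0.0, 1.0, 0.0]])     # one-hot label
logits = tf.constant([[1.0, 3.0, -2.0]])    # raw network output, no activation

# from_logits=True: the loss applies softmax to the logits internally,
# then computes -sum(y_true * log(y_hat)).
cce_logits = tf.keras.losses.CategoricalCrossentropy(from_logits=True)
loss_a = cce_logits(y_true, logits)

# Equivalent for well-behaved logits: apply the softmax yourself and pass
# probabilities with from_logits=False.
y_hat = tf.nn.softmax(logits)
cce_probs = tf.keras.losses.CategoricalCrossentropy(from_logits=False)
loss_b = cce_probs(y_true, y_hat)

print(loss_a.numpy(), loss_b.numpy())       # same value up to float error
```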