CNN Week1 2nd Assignment - 'categorical_crossentropy' applied twice?

Categorical cross entropy is not the same thing as softmax: softmax is the activation function, and categorical cross entropy is the loss function that is used when softmax is the activation.

What we always do is omit the explicit softmax (or sigmoid in the binary case) from the output layer and instead use the `from_logits=True` mode of the corresponding cross entropy loss function, which gives better numeric behavior. That flag tells the loss function to apply softmax internally. But it also means that when we want to use the trained network in inference (prediction) mode, we have to apply softmax to the output manually to get the prediction values.
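As a rough numpy sketch of what `from_logits=True` buys numerically (the function names here are just for illustration, not the actual Keras internals): folding softmax into the loss lets the framework use the log-sum-exp trick, which stays finite even when an explicit softmax-then-log would overflow.

```python
import numpy as np

def naive_cross_entropy(logits, label):
    # Explicit softmax, then log: exp() can overflow for large logits.
    probs = np.exp(logits) / np.sum(np.exp(logits))
    return -np.log(probs[label])

def from_logits_cross_entropy(logits, label):
    # Roughly what from_logits=True does internally: fold softmax into
    # the loss via the log-sum-exp trick, shifting by the max logit
    # so every exp() argument is <= 0 and cannot overflow.
    shifted = logits - np.max(logits)
    return np.log(np.sum(np.exp(shifted))) - shifted[label]

def softmax(logits):
    # The manual step needed at inference time to turn the network's
    # raw logits back into probabilities.
    shifted = logits - np.max(logits)
    e = np.exp(shifted)
    return e / e.sum()

moderate = np.array([2.0, 1.0, 0.1])
extreme = np.array([1000.0, 1.0, 0.1])

# Both versions agree on well-behaved logits...
print(naive_cross_entropy(moderate, 0))
print(from_logits_cross_entropy(moderate, 0))

# ...but the naive version overflows to nan on extreme logits,
# while the from_logits version stays finite.
print(naive_cross_entropy(extreme, 0))
print(from_logits_cross_entropy(extreme, 0))

# Inference: recover probabilities from logits manually.
print(softmax(moderate))
```

The `softmax` helper at the end is the manual post-processing step the paragraph above refers to: since the trained network outputs raw logits, you apply softmax yourself to read off class probabilities.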

Here’s a thread which explains why it is done that way.