SparseCategoricalCrossentropy vs. CategoricalCrossentropy

HongruNUS · July 5, 2022, 6:58am

In the Machine Learning Specialization taught by Andrew Ng, most neural network models for multiclass classification use SparseCategoricalCrossentropy. For example, in a neural network that recognizes the ten handwritten digits 0-9, the code is as follows:

model.compile(
    # Pass a loss instance, not the class; add from_logits=True if the output layer has no softmax.
    loss=tf.keras.losses.SparseCategoricalCrossentropy(),
    optimizer=tf.keras.optimizers.Adam(0.001),
)

I wanted to try CategoricalCrossentropy for this handwritten digit recognition example to compare the outcomes. However, I get an error that says 'ValueError: Shapes (None, 1) and (None, 10) are incompatible'.

So, is it possible to use the CategoricalCrossentropy loss function in this case? What are the differences between SparseCategoricalCrossentropy and CategoricalCrossentropy in TensorFlow?

Remark: CategoricalCrossentropy expects the target of each example to be one-hot encoded: the value at the target index is 1 and the other N-1 entries are 0. For example, with 10 possible target values and a target of 2, the label would be [0,0,1,0,0,0,0,0,0,0].

Elemento · July 5, 2022, 7:12am

Hey @HongruNUS,
Welcome to the community. To answer this question, we don't need to go further than your own remark: CategoricalCrossentropy expects one-hot encoded targets.

This is pretty much the only difference between the two loss functions. In simple words, if you have y in terms of integers

y = [6, 3, 9]

(Considering we have 3 examples) then we use SparseCategoricalCrossentropy, and if you have y in terms of one-hot encoded labels

y = [
   [0, 0, 0, 0, 0, 0, 1, 0, 0, 0],
   [0, 0, 0, 1, 0, 0, 0, 0, 0, 0],
   [0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
]

then we use CategoricalCrossentropy. You can easily convert your y from one representation to another and depending on your representation, you can use either of these loss functions.
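As a minimal sketch of that conversion (using NumPy is an assumption here, and the variable names are only for illustration), indexing an identity matrix turns integer labels into one-hot rows, and argmax goes back the other way:

```python
import numpy as np

num_classes = 10                      # digits 0-9
y_int = np.array([6, 3, 9])           # integer labels, one per example

# Row i of the identity matrix is the one-hot vector for class i,
# so indexing with the labels picks one row per example.
y_onehot = np.eye(num_classes, dtype=int)[y_int]

# argmax along each row recovers the integer label.
y_back = np.argmax(y_onehot, axis=1)

print(y_onehot.shape)   # (number of examples, number of classes)
print(y_back)
```

tf.keras.utils.to_categorical(y_int, num_classes=10) does the same integer-to-one-hot step inside TensorFlow.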

So, in your example, since you are trying to use CategoricalCrossentropy, you must first convert your integer labels into one-hot encoded labels. CategoricalCrossentropy expects the labels to have shape (number of examples, number of classes), whereas you are passing labels of shape (number of examples, 1), which is exactly what the ValueError is complaining about. I hope this helps.
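To check that the two losses agree once the labels are in the format each one expects, here is a small sketch. The random logits stand in for a model's output, and from_logits=True is an assumption for this illustration, not something taken from the course notebook:

```python
import numpy as np
import tensorflow as tf

y_int = np.array([6, 3, 9])                                     # integer labels
y_onehot = tf.keras.utils.to_categorical(y_int, num_classes=10)  # shape (3, 10)

# Hypothetical raw model outputs (logits) for 3 examples and 10 classes.
rng = np.random.default_rng(0)
logits = rng.normal(size=(3, 10)).astype("float32")

sparse_loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
dense_loss = tf.keras.losses.CategoricalCrossentropy(from_logits=True)

# Each loss gets the label format it expects; the predictions are identical.
sparse_value = float(sparse_loss(y_int, logits))
dense_value = float(dense_loss(y_onehot, logits))

print(sparse_value, dense_value)
```

The two printed values should match up to floating-point error, so the choice between the losses comes down to how your labels are stored.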

Regards,
Elemento

HongruNUS · July 6, 2022, 3:45am

Thanks for your reply. I appreciate it.
