Hi, I did the week 4 assignment of course 2 and I have an issue with running the model.
I used categorical_crossentropy to calculate the loss and it gives me an error. However, when I change it to sparse_categorical_crossentropy, the model runs with no error. Why does this happen? The input data shapes are:
Images of training generator have shape: (27455, 28, 28, 1)
Labels of training generator have shape: (27455,)
Images of validation generator have shape: (7172, 28, 28, 1)
Labels of validation generator have shape: (7172,)
Categorical crossentropy expects true labels to be one-hot encoded. Sparse categorical crossentropy expects the true labels to be integers. Your labels have shape (27455,), one integer per image, so only the sparse version matches them. The advantages of using the sparse version are:
- No need to one-hot encode labels. Use tokenized label index / integer class labels directly.
- Smaller memory footprint since you don’t one-hot encode the labels.
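To make the difference in label format concrete, here is a minimal pure-Python sketch of the two losses for a single example. The function bodies and the three-class probabilities are made up for illustration; this is not how Keras implements them, only the same arithmetic.

```python
import math

def categorical_crossentropy(one_hot_label, probs):
    # Expects a one-hot vector with one entry per class.
    if len(one_hot_label) != len(probs):
        raise ValueError("label must be one-hot with one entry per class")
    return -sum(t * math.log(p) for t, p in zip(one_hot_label, probs))

def sparse_categorical_crossentropy(class_index, probs):
    # Expects a single integer class index.
    return -math.log(probs[class_index])

probs = [0.1, 0.7, 0.2]  # hypothetical softmax output for 3 classes

# Same example, two label formats: one-hot vs. integer index.
loss_dense = categorical_crossentropy([0, 1, 0], probs)
loss_sparse = sparse_categorical_crossentropy(1, probs)
```

Both calls give the same loss value. The only thing that changes is the shape of the label you have to pass in.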
But in the course video and the C2_W4_Lab_1_multi_class_classifier assignment, categorical_crossentropy is used and it worked! The only difference between the final assignment and that one is the way we input the data: one used ImageDataGenerator and the other one reads from a .csv file. Does that make a difference?
Please read about class_mode from this link on what it means when it’s set to categorical.
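As a rough illustration: a generator built with class_mode set to categorical yields one-hot labels, while labels read from a .csv file and passed in as an array stay as plain integers unless you convert them yourself. Here is a small pure-Python sketch of that conversion. The helper name to_one_hot, the sample labels, and the class count of 3 are assumptions for the example; in Keras, tf.keras.utils.to_categorical does the same job.

```python
def to_one_hot(labels, num_classes):
    # Turn integer class labels into one-hot rows.
    rows = []
    for label in labels:
        row = [0.0] * num_classes
        row[label] = 1.0
        rows.append(row)
    return rows

labels = [2, 0, 1]  # hypothetical integer labels, shape (3,)
one_hot = to_one_hot(labels, num_classes=3)  # shape (3, 3)
```

After a conversion like this, the labels would have the 2-D shape that categorical_crossentropy expects. Keeping the integer labels and using sparse_categorical_crossentropy avoids the extra step.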