Problem with the number of classes in the MNIST dataset

The dataset is supposed to have 26 classes denoted from 0 to 24. However, when I check training_labels with np.unique, there are only 24, and label 9 is missing?
How am I supposed to deal with this issue? Thank you in advance!
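For reference, here is a minimal sketch of the kind of check described above, assuming `training_labels` is a NumPy array of integer labels (the array below is made up to mimic the dataset, with label 9 absent):

```python
import numpy as np

# Hypothetical labels mimicking the dataset: values in 0-24, but 9 never occurs
training_labels = np.array([0, 1, 2, 8, 10, 24, 3, 24, 0])

classes = np.unique(training_labels)   # sorted distinct label values
print(classes)                         # distinct labels present
print(len(classes))                    # number of distinct classes
print(9 in classes)                    # label 9 is missing
```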

Hello @FreyMiggen ,

Send me your notebook via DM so that I can check where it went wrong. By clicking on the profile picture, you will see an option to message; there you can attach your notebook. Then we can discuss the issues here, under the topic you created.

With regards,
Nilosree Sengupta

Well, there are 25, right? Since it starts at 0. But it is a good point that 9 is missing, which seems wrong. So maybe that means there are no samples of the 10th letter in the alphabet, which is J, right?

Yes, sorry for my mistake. The dataset is said to have 25 classes, but only 24 are found. So I think it will likely cause problems during training with SparseCategoricalCrossentropy if I set the number of output classes to 25.
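As a sanity check on this worry, here is a NumPy sketch (not the Keras implementation) of sparse categorical cross-entropy. It only ever indexes the probability columns for labels that actually occur, so a label value that never appears (like 9 here) is harmless, as long as every label is smaller than the number of output units:

```python
import numpy as np

def sparse_categorical_crossentropy(y_true, probs):
    """Mean negative log-probability of the true class.
    y_true: integer labels, shape (n,); probs: shape (n, num_classes)."""
    n = len(y_true)
    return -np.mean(np.log(probs[np.arange(n), y_true]))

num_classes = 25                               # output layer size
y_true = np.array([0, 8, 10, 24])              # label 9 never occurs
probs = np.full((4, num_classes), 1.0 / num_classes)  # uniform predictions

loss = sparse_categorical_crossentropy(y_true, probs)
print(loss)  # only columns 0, 8, 10, 24 are ever indexed; column 9 is simply unused
```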

But my point was that it does have 25 classes, right? 0 through 24 is 25 classes. But I would have expected it to have 26 and why are there no instances of class 9 (which I think would be J)?

The dataset is said to have 25 classes, but it actually only has 24 (0 to 24 would be 25, minus 1 because 9 is missing, giving 24).

I have already sent my notebook. Have you received it yet? Thanks for your help!

Hello @FreyMiggen ,

As @paulinpaloalto Sir has said, there is a discrepancy in the dataset: the 9th label is missing, so only 24 classes appear.

And regarding the notebook, I have reviewed it.
The matter you are concerned about will not lead to an issue; that part of your code is okay.
You don't need to fix it. It will work with only 24 classes.

The issue is in the parameters and the format you used when coding the CNN, under the function `def create_model()`: for example, not using an activation function in the 2nd Dense layer, etc. These lead to errors when you train the model.

With regards,
Nilosree Sengupta

No letter ‘J’? That seems very odd.

Thank you for your support! The model worked as expected, and nothing seems to be off even though there are only 24 classes.


I share your intuition that the number of classes represented in the training set, the number of neurons in the output layer, and the number of classes you need to classify in the operational environment should all be the same. Your model shouldn't be predicting output classes it has never trained on, so ideally the number of training classes is not fewer than the network output shape or the number of operational classes. The model will likely select the 'next best' class, maybe predicting 'g' if there were no 'j' training inputs. Having more training classes than output units seems like the worse problem, though; wouldn't that cause a shape mismatch between your Y and \hat{Y}?
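To illustrate that last point, here is a toy NumPy sketch (made-up numbers, standing in for what a sparse cross-entropy loss does internally): when a label value is at least as large as the number of output units, the loss has no corresponding probability column to index, so it fails rather than training:

```python
import numpy as np

num_units = 24                        # output layer too small
probs = np.full((1, num_units), 1.0 / num_units)  # one uniform prediction
label = np.array([24])                # a valid class id, but index out of range

try:
    loss = -np.log(probs[np.arange(1), label])
except IndexError as e:
    print("label 24 needs at least 25 output units:", e)
```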

If you train your model with no letter j, can you test what it predicts when you give it one? Just, well, ai_curious :nerd_face:

FYI, this Kaggle dataset only has 24 classes. 'J' and 'Z' in sign language require motion, so they are not included in the data.
