Problem with the number of classes in the MNIST dataset

The dataset is supposed to have 26 classes denoted from 0 to 24. However, when I check training_labels with np.unique, there are only 24, and label 9 is missing?
How am I supposed to deal with this issue? Thank you in advance!
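For reference, here is a minimal sketch of the kind of check described above, assuming `training_labels` is a NumPy array of integer labels (the array below is made up to mimic the dataset, with label 9 absent):

```python
import numpy as np

# Hypothetical labels mimicking the dataset: values in 0-24, but 9 never occurs
training_labels = np.array([0, 1, 2, 8, 10, 24, 3, 24, 0])

classes = np.unique(training_labels)   # sorted distinct label values
print(classes)                         # distinct labels present
print(len(classes))                    # number of distinct classes
print(9 in classes)                    # label 9 is missing
```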

Hello @FreyMiggen ,

Send me your notebook via DM so that I can check where it went wrong. By clicking on the profile picture, you will see an option to message; there you can attach your notebook. Then we can discuss the issues here, under the topic you created.

With regards,
Nilosree Sengupta

Well, there are 25, right? Since it starts at 0. But it is a good point that 9 is missing, which seems wrong. So maybe that means there are no samples of the 10th letter in the alphabet, which is J, right?

Yes, sorry for my mistake. The dataset is said to have 25 classes, but only 24 are found. So I think it will likely cause problems during training with SparseCategoricalCrossentropy if I set the number of output classes to 25.
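As a sanity check on this worry, here is a NumPy sketch (not the Keras implementation) of sparse categorical cross-entropy. It only ever indexes the probability columns for labels that actually occur, so a label value that never appears (like 9 here) is harmless, as long as every label is smaller than the number of output units:

```python
import numpy as np

def sparse_categorical_crossentropy(y_true, probs):
    """Mean negative log-probability of the true class.
    y_true: integer labels, shape (n,); probs: shape (n, num_classes)."""
    n = len(y_true)
    return -np.mean(np.log(probs[np.arange(n), y_true]))

num_classes = 25                               # output layer size
y_true = np.array([0, 8, 10, 24])              # label 9 never occurs
probs = np.full((4, num_classes), 1.0 / num_classes)  # uniform predictions

loss = sparse_categorical_crossentropy(y_true, probs)
print(loss)  # only columns 0, 8, 10, 24 are ever indexed; column 9 is simply unused
```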

But my point was that it does have 25 classes, right? 0 through 24 is 25 classes. But I would have expected it to have 26 and why are there no instances of class 9 (which I think would be J)?

The dataset is said to have 25 classes, but it actually only has 24 (0 to 24 would be 25, minus 1 because 9 is missing, giving 24).

I have already sent my notebook. Have you received it yet? Thanks for your help!

Hello @FreyMiggen ,

As @paulinpaloalto Sir has said, there is a discrepancy in the dataset: the 9th label is missing, so only 24 classes appear.

And regarding the notebook, I have reviewed it.
The matter you are concerned about will not lead to an issue; that part of your code is okay.
You don't need to fix it. It will work with only 24 classes.

The issue is in the parameters and the format you used when coding the CNN, under the function `def create_model()`: for example, not using an activation function in the 2nd Dense layer, etc. These lead to errors when you train the model.

With regards,
Nilosree Sengupta

No letter ‘J’? That seems very odd.

Thank you for your support! The model worked as expected, and nothing seems to be off even though there are only 24 classes.


I share your intuition that the number of classes represented in the training set, the number of neurons in the output layer, and the number of classes you need to classify in the operational environment should all be the same. Your model shouldn't be predicting output classes it has never trained on, so ideally the number of training classes is not fewer than the network output shape or the number of operational classes. The model will likely select the 'next best' class, maybe predicting 'g' if there were no 'j' training inputs. Having more training classes than output units seems like the worse problem, though; wouldn't that cause a shape mismatch between your Y and \hat{Y}?
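To illustrate that last point, here is a toy NumPy sketch (made-up numbers, standing in for what a sparse cross-entropy loss does internally): when a label value is at least as large as the number of output units, the loss has no corresponding probability column to index, so it fails rather than training:

```python
import numpy as np

num_units = 24                        # output layer too small
probs = np.full((1, num_units), 1.0 / num_units)  # one uniform prediction
label = np.array([24])                # a valid class id, but index out of range

try:
    loss = -np.log(probs[np.arange(1), label])
except IndexError as e:
    print("label 24 needs at least 25 output units:", e)
```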

If you train your model with no letter j, can you test what it predicts when you give it one? Just, well, ai_curious :nerd_face:

FYI, this Kaggle dataset only has 24 classes. 'J' and 'Z' in sign language require motion, so they are not included in the data.
