I am trying to pass the C2W4 assignment and got this error when training the model:
File "/usr/local/lib/python3.10/dist-packages/keras/src/losses.py", line 2221, in categorical_crossentropy
return backend.categorical_crossentropy(
File "/usr/local/lib/python3.10/dist-packages/keras/src/backend.py", line 5575, in categorical_crossentropy
target.shape.assert_is_compatible_with(output.shape)
ValueError: Shapes (None, 1) and (None, 26) are incompatible
This happens when I create my model like this. I still need a last layer of 26 nodes (or 25, which gives a slight improvement), so I don't understand why a shape of (None, 1) is expected:
{moderator edit - solution code removed}
I had the correct output in the previous execution:
**Expected Output:**
```
Images of training generator have shape: (27455, 28, 28, 1)
Labels of training generator have shape: (27455,)
Images of validation generator have shape: (7172, 28, 28, 1)
Labels of validation generator have shape: (7172,)
```
I can't seem to find this code in the C2W4 assignment you mentioned. You filed this issue under Deep Learning Specialization > Convolutional Neural Networks - is that correct? Can you give us the name of the course and the name of the assignment?
From the error message and expected output you provided, I think it expects the last layer in your code to output a 1-dimensional value, so the commented-out line with Dense(1, …) seems necessary, but I can't tell what is correct without knowing the actual assignment you're referring to.
Yes, it is "Multi-class Classification" in the Coursera course DeepLearning.AI TensorFlow Developer Professional Certificate > Convolutional Neural Networks in TensorFlow.
It is true that I don't get any error if I use a 1-dimensional last layer. But since this is a multi-class problem, that doesn't seem adequate, does it?
{moderator edit - solution code removed}
I think the problem is that train_generator is not set up for multi-class classification (its labels are not one-hot), so I tried to convert them to categorical, but execution takes too long and never ends in Colab:
```python
# Save your model
model = create_model()

tf.keras.utils.to_categorical(train_generator, 26)

# Train your model
history = model.fit(train_generator,
                    epochs=15,
                    validation_data=validation_generator)
```
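Side note: tf.keras.utils.to_categorical expects an array of integer labels rather than a generator object, and it returns a new array instead of modifying anything in place, which would explain why the call above doesn't behave as intended. A minimal sketch of its intended use, with made-up labels:

```python
import numpy as np
import tensorflow as tf

# to_categorical takes integer labels and returns one-hot rows.
labels = np.array([0, 3, 24])                        # shape (3,), values in 0..25
one_hot = tf.keras.utils.to_categorical(labels, 26)  # shape (3, 26)
print(one_hot.shape)  # (3, 26)
print(one_hot[1])     # all zeros except a 1 at index 3
```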
If the model is outputting a 26-class softmax multiclass classifier, then you're right that it doesn't really make sense to reduce that to one neuron for the output. Maybe the problem is that your labels are in "categorical" form (one element with a value between 0 and 25 inclusive) and you're just using the wrong loss function. In this kind of multiclass case, you have two choices (see the sketch below):

1. Convert your Y values to "one hot" form and use CategoricalCrossentropy as the loss function.
2. Leave the Y values in categorical form and use SparseCategoricalCrossentropy as the loss function, which will do the one-hot conversion internally.

My guess is that you used CategoricalCrossentropy but your labels are in categorical form, which is why you get that shape mismatch error.
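Here is a small sketch of the two options on toy data; the y_hat values below just stand in for your model's (m, 26) softmax outputs:

```python
import numpy as np
import tensorflow as tf

y_int = np.array([0, 3, 24])  # labels in "categorical" form, shape (m,)
y_hat = tf.random.uniform((3, 26))
y_hat = y_hat / tf.reduce_sum(y_hat, axis=1, keepdims=True)  # stand-in softmax outputs, shape (m, 26)

# Option 1: convert the labels to one-hot form and use the categorical loss
y_onehot = tf.keras.utils.to_categorical(y_int, 26)  # shape (m, 26)
loss1 = tf.keras.losses.categorical_crossentropy(y_onehot, y_hat)

# Option 2: keep the integer labels and use the sparse variant,
# which does the one-hot conversion internally
loss2 = tf.keras.losses.sparse_categorical_crossentropy(y_int, y_hat)

print(np.allclose(loss1.numpy(), loss2.numpy()))  # True: same loss either way
```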
But I'm in the same situation as hackyon: I don't know this course, so I don't really know the details of what this assignment is doing.
Hi Paul,
Yes, I think you're right about the labels. The labels are already in categorical form, which I didn't notice at the beginning of the assignment: "The first value is the label (the numeric representation of each letter)". So that explains why a one-dimensional layer is expected at the end.
You can do "argmax" on the (m, 26) softmax outputs to get the actual class predictions, but note that the loss function is based on softmax, so it needs all 26 values in "probability distribution" form for the \hat{Y} values. So the option is not to make your network output an (m, 1) tensor, but to make your cost function able to cope with (m, 1) labels, right? That was what I was trying to say in my previous post.
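For instance, a quick sketch with simulated softmax outputs:

```python
import numpy as np

# Simulated softmax outputs for 4 examples over 26 classes
scores = np.random.rand(4, 26)
probs = scores / scores.sum(axis=1, keepdims=True)  # each row sums to 1

predictions = np.argmax(probs, axis=1)  # shape (4,), one class index per example
print(predictions)                      # e.g. [17  3 25  8]
```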
Thank you, yes, my cost function has to be able to cope with (m, 1) labels. "Sparse Categorical Crossentropy" was the appropriate categorical loss for this purpose; combined with an (m, 25) softmax last layer, it resolved the case.
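For future readers, a generic sketch of the pattern that resolved it (not the assignment's solution code; the hidden layers are up to you):

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(28, 28, 1)),
    tf.keras.layers.Flatten(),
    # one softmax probability per class (25 also works for this dataset)
    tf.keras.layers.Dense(26, activation='softmax'),
])

# The sparse loss copes with integer labels of shape (m,) directly,
# so no to_categorical conversion is needed before model.fit().
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
```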