C2W4 assignment, training model

Hi,

I am trying to pass the C2W4 assignment and got the error when training the model:

    File "/usr/local/lib/python3.10/dist-packages/keras/src/losses.py", line 2221, in categorical_crossentropy
        return backend.categorical_crossentropy(
    File "/usr/local/lib/python3.10/dist-packages/keras/src/backend.py", line 5575, in categorical_crossentropy
        target.shape.assert_is_compatible_with(output.shape)

    ValueError: Shapes (None, 1) and (None, 26) are incompatible

When I created my model like this, I still need a last layer of 26 nodes (or 25, which gives a slight improvement). So I don't understand why a shape of (None, 1) is expected.

{moderator edit - solution code removed}

I had the correct output in the previous execution:

**Expected Output:**

    Images of training generator have shape: (27455, 28, 28, 1)
    Labels of training generator have shape: (27455,)
    Images of validation generator have shape: (7172, 28, 28, 1)
    Labels of validation generator have shape: (7172,)

Any suggestions? Thank you.

I can’t seem to find this code in the C2W4 assignment you mentioned. You filed this issue under Deep Learning Specialization > Convolutional Neural Networks - is that correct? Can you give us the name of the course and the name of the assignment?

From the error message and expected output you provided, I think it expects the last layer in your code to output a 1-dimensional value, so the commented-out line for Dense(1, …) seems necessary, but I can't tell what is correct without knowing the actual assignment you're referring to.

Yes, it is “Multi-class Classification” under the DeepLearning.AI TensorFlow Developer Professional Certificate > Convolutional Neural Networks in TensorFlow Coursera course.

It is true that I don't get any error if I use a 1-dimensional last layer. But since this is a multi-class problem, that doesn't look adequate, no?

{moderator edit - solution code removed}

I think the problem is that train_generator does not provide the labels in multi-class (one-hot) form, so I tried to convert them to categorical, but the execution takes too long and never finishes in Colab:

    # Save your model
    model = create_model()
    tf.keras.utils.to_categorical(train_generator, 26)

    # Train your model
    history = model.fit(train_generator,
                        epochs=15,
                        validation_data=validation_generator)

Well, that explains why I could not find this assignment in the DLS courses.

You can use the “pencil” icon in the thread title to move this to the correct forum area.

Once there, you’ll probably get a warning about posting your code on the forum. That’s not allowed by the Code of Conduct.

If the model is outputting a 26-class softmax multiclass classifier, then you're right that it doesn't really make sense to reduce that to one neuron for the output. Maybe the problem is that your labels are in “categorical” form (one element with a value between 0 and 25 inclusive) and you're just using the wrong loss function. In this kind of multiclass case, you have two choices:

  1. Convert your Y values to “one hot” form and use CategoricalCrossEntropy as the loss function.
  2. Leave the Y values in categorical form and use SparseCategoricalCrossEntropy as the loss function, which will do the one hot conversion internally.

My guess is that you used CategoricalCrossEntropy but your labels are in categorical form, which is why you get that shape mismatch error.
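
To make those two options concrete, here is a minimal sketch. The model and the labels here are hypothetical stand-ins, just to make the sketch self-contained; note also that to_categorical is applied to the label array itself, never to the generator object:

    import numpy as np
    import tensorflow as tf

    # A trivial stand-in model with a 26-node softmax output
    # (hypothetical, not the assignment's architecture).
    model = tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28, 1)),
        tf.keras.layers.Dense(26, activation='softmax'),
    ])

    # Hypothetical integer labels in [0, 25], shape (m,)
    train_labels = np.array([0, 3, 25])

    # Option 1: one-hot encode the labels ((m,) -> (m, 26)) and use the
    # non-sparse loss.
    one_hot_labels = tf.keras.utils.to_categorical(train_labels, num_classes=26)
    model.compile(optimizer='adam',
                  loss=tf.keras.losses.CategoricalCrossentropy(),
                  metrics=['accuracy'])

    # Option 2: leave the labels as integers and use the sparse loss,
    # which does the one-hot conversion internally.
    model.compile(optimizer='adam',
                  loss=tf.keras.losses.SparseCategoricalCrossentropy(),
                  metrics=['accuracy'])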

But I’m in the same situation as hackyon: I don’t know this course, so don’t really know the details of what this assignment is doing.


Hi Paul,
Yes, I think you're right about the labels. The labels are already in categorical form, which I didn't notice at the beginning of the assignment: “The first value is the label (the numeric representation of each letter)”. So that explains why a one-dimensional shape is expected at the end.

You can do “argmax” on the (m, 26) softmax outputs to get the actual class predictions, but note that the loss function is based on softmax, so it needs all 26 values in the “probability distribution” form for the \hat{Y} values. So the option is not to make your network output an (m, 1) tensor, but to make your cost function able to cope with (m, 1) labels, right? That was what I was trying to say in my previous post. :nerd_face:
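
As a minimal sketch of that argmax step (model and x_val are hypothetical placeholders for a trained model and some validation images):

    import numpy as np

    # probs has shape (m, 26): one softmax probability distribution per row
    probs = model.predict(x_val)

    # Collapse the class axis to get (m,) integer class predictions
    predicted_classes = np.argmax(probs, axis=1)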

Thank you, yes, my cost function has to be able to cope with (m, 1) labels. “Sparse Categorical Crossentropy” was the appropriate categorical loss for this purpose; together with an (m, 25) softmax last layer, it resolved the case.
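
For future readers, a minimal sketch of that final combination, assuming the 28x28x1 grayscale inputs from this thread; the model body is a hypothetical placeholder, not the assignment's solution:

    import tensorflow as tf

    # Hypothetical sketch: the essential points are the 25-node softmax
    # output and the sparse loss; the Flatten body is just a placeholder.
    model = tf.keras.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28, 1)),
        tf.keras.layers.Dense(25, activation='softmax'),  # one node per label
    ])

    # The sparse loss accepts integer labels of shape (m,) directly and
    # performs the one-hot conversion internally.
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])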