C2W4 assignment, training model

Hi,

I am trying to pass the C2W4 assignment and got the following error when training the model:

    File "/usr/local/lib/python3.10/dist-packages/keras/src/losses.py", line 2221, in categorical_crossentropy
        return backend.categorical_crossentropy(
    File "/usr/local/lib/python3.10/dist-packages/keras/src/backend.py", line 5575, in categorical_crossentropy
        target.shape.assert_is_compatible_with(output.shape)

    ValueError: Shapes (None, 1) and (None, 26) are incompatible
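The same mismatch can be reproduced in isolation with a tiny stand-in model (a sketch, not the assignment code — the layer sizes and data here are hypothetical):

```python
import numpy as np
import tensorflow as tf

# Minimal stand-in: a 26-way softmax head, like the assignment's last layer
model = tf.keras.Sequential([
    tf.keras.layers.Dense(26, activation='softmax', input_shape=(4,))
])
model.compile(optimizer='adam', loss='categorical_crossentropy')

x = np.zeros((8, 4), dtype='float32')
y = np.arange(8, dtype='int64') % 26  # integer labels, shape (8,)

# categorical_crossentropy expects one-hot targets of shape (batch, 26),
# so integer labels trigger the shape-compatibility error:
# model.fit(x, y)  # raises the shape-incompatibility ValueError
```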

This happens when I create my model as shown below. Since I still need a last layer of 26 nodes (or 25, as a slight improvement), I don’t understand why a shape of (None, 1) is expected.

# grader-required-cell

def create_model():

  ### START CODE HERE

  # Define the model
  # Use no more than 2 Conv2D and 2 MaxPooling2D
  model = tf.keras.models.Sequential([
    # This is the first convolution
    tf.keras.layers.Conv2D(32, (3,3), activation='relu', input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D(2, 2),
    # The second convolution
    tf.keras.layers.Conv2D(64, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2,2),
    # Flatten the results to feed into a DNN
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(256, activation='relu'),
    tf.keras.layers.Dense(26, activation='softmax'),
    #tf.keras.layers.Dense(1, activation='softmax')
])

  # Print the model summary
  model.summary()

  model.compile(optimizer = 'adam',
                loss = 'categorical_crossentropy',
                metrics=['accuracy'])

  ### END CODE HERE

  return model

I had the correct output in the previous execution:

**Expected Output:**

Images of training generator have shape: (27455, 28, 28, 1)
Labels of training generator have shape: (27455,)
Images of validation generator have shape: (7172, 28, 28, 1)
Labels of validation generator have shape: (7172,)

Any suggestion is welcome… Thank you

I can’t seem to find this code in the C2W4 assignment you mentioned. You filed this issue under Deep Learning Specialization > Convolutional Neural Networks - is that correct? Can you give us the name of the course and the name of the assignment?

From the error message and expected output you provided, I think it expects the last layer in your code to output a 1-dimensional value, so the commented-out Dense(1, …) line seems necessary, but I can’t tell what is correct without knowing the actual assignment you’re referring to.

Yes, it is “Multi-class Classification” under the DeepLearning.AI TensorFlow Developer Professional Certificate > Convolutional Neural Networks in TensorFlow course on Coursera.

It is true that I don’t get any error if I use a 1-dimensional last layer. But since this is a multi-class problem, that doesn’t look adequate, no?

# grader-required-cell

def create_model():

  ### START CODE HERE

  # Define the model
  # Use no more than 2 Conv2D and 2 MaxPooling2D
  model = tf.keras.models.Sequential([
    # This is the first convolution
    tf.keras.layers.Conv2D(32, (3,3), activation='relu', input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D(2, 2),
    # The second convolution
    tf.keras.layers.Conv2D(64, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2,2),
    # Flatten the results to feed into a DNN
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(256, activation='relu'),
    tf.keras.layers.Dense(26, activation='softmax'),
    tf.keras.layers.Dense(1, activation='softmax')
])

  # Print the model summary
  # model.summary()

  model.compile(optimizer = 'adam',
                loss = 'categorical_crossentropy',
                metrics=['accuracy'])

  ### END CODE HERE

  return model

I think the problem is that train_generator’s labels are not in multi-class (one-hot) form, so I tried to convert them to categorical, but execution takes too long and never finishes in Colab:

Save your model

model = create_model()
tf.keras.utils.to_categorical(train_generator, 26)

Train your model

history = model.fit(train_generator,
                    epochs=15,
                    validation_data=validation_generator)
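(Side note: tf.keras.utils.to_categorical expects an array of integer labels, not a generator, and it returns a new array rather than modifying anything in place, so the call above has no effect on training. A minimal sketch with hypothetical labels:)

```python
import numpy as np
import tensorflow as tf

# to_categorical one-hot encodes an array of integer class indices
labels = np.array([0, 1, 25])                                   # shape (3,)
one_hot = tf.keras.utils.to_categorical(labels, num_classes=26)  # shape (3, 26)
# Each row has a single 1.0 at the position given by the label
```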

Well, that explains why I could not find this assignment in the DLS courses.

You can use the “pencil” icon in the thread title to move this to the correct forum area.

Once there, you’ll probably get a warning about posting your code on the forum. That’s not allowed by the Code of Conduct.

If the model is outputting a 26-class softmax multiclass classifier, then you’re right that it doesn’t really make sense to reduce that to one neuron for the output. Maybe the problem is that your labels are in “categorical” form (one element with a value between 0 and 25 inclusive) and you’re just using the wrong loss function. In this kind of multiclass case, you have two choices:

  1. Convert your Y values to “one hot” form and use CategoricalCrossEntropy as the loss function.
  2. Leave the Y values in categorical form and use SparseCategoricalCrossEntropy as the loss function, which will do the one hot conversion internally.
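The equivalence of the two choices can be checked directly (a sketch with hypothetical labels and a random stand-in for the model’s softmax output):

```python
import numpy as np
import tensorflow as tf

# Hypothetical integer labels in "categorical" form, values 0..25
y = np.array([0, 3, 25])
# Stand-in for the model's (m, 26) softmax output
probs = tf.keras.activations.softmax(tf.random.normal((3, 26)))

# Option 1: one-hot encode the labels, then use categorical crossentropy
y_onehot = tf.keras.utils.to_categorical(y, num_classes=26)        # shape (3, 26)
loss1 = tf.keras.losses.categorical_crossentropy(y_onehot, probs)

# Option 2: keep the integer labels and use sparse categorical crossentropy,
# which does the one-hot conversion internally
loss2 = tf.keras.losses.sparse_categorical_crossentropy(y, probs)

# Both yield the same per-sample losses
```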

My guess is that you used CategoricalCrossEntropy but your labels are in categorical form, which is why you get that shape mismatch error.

But I’m in the same situation as hackyon: I don’t know this course, so don’t really know the details of what this assignment is doing.

Hi Paul,
Yes, I think you’re right about the labels. The labels already look to be in categorical form, which I didn’t notice at the beginning of the assignment: the first value of each row is the label (the numeric representation of each letter), so that explains why a shape of (None, 1) was expected.

You can do “argmax” on the (m, 26) softmax outputs to get the actual class predictions, but note that the loss function is based on softmax so it needs all 26 values in the “probability distribution” form for the \hat{Y} values. So the option is not to make your network output a (m,1) tensor, but to make your cost function able to cope with (m,1) labels, right? That was what I was trying to say in my previous post. :nerd_face:
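Collapsing the (m, 26) softmax outputs to class predictions is just an argmax over the last axis (hypothetical values):

```python
import numpy as np

# Hypothetical softmax outputs for 2 samples over 26 classes
probs = np.zeros((2, 26))
probs[0, 4] = 0.9
probs[0, 0] = 0.1
probs[1, 25] = 0.8
probs[1, 1] = 0.2

# Predicted class = index of the highest probability per row
preds = np.argmax(probs, axis=-1)  # -> array([ 4, 25])
```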

Thank you, yes, my cost function has to be able to cope with (m, 1) labels. ‘Sparse Categorical Crossentropy’ was the appropriate loss for this purpose; combined with an (m, 25) softmax last layer, it resolved the case.
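For anyone landing here later, the fix described above amounts to keeping the softmax output layer and only swapping the loss so it accepts integer labels (a sketch of the pattern, not the grader’s reference solution; a 26-unit head is used here per the assignment hint):

```python
import tensorflow as tf

# Same architecture as in the thread; only the loss changes so that
# integer labels of shape (m,) work without one-hot encoding
model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(256, activation='relu'),
    tf.keras.layers.Dense(26, activation='softmax'),
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
```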