Week 3 - UNET_V4 - Training Error - "Shapes (None, 96, 128, 1) and (None, 96, 128, 23) are incompatible"

Hi - I completed the exercise and got the 100% grade, but during training I get the error in the title. Looking at my code, I do not see where the mask image (96, 128, 1) is split into one channel per class, which is the output shape of the last conv layer of the model. I tried adding a final (1,1) filter but it did not train correctly. What am I missing?


Please show us the full exception trace that includes the error you are showing in the title. It is a bit hard to believe that you could get that error and still get 100% on the assignment.

Thanks for your response. Is there a way to reset the notebook and start again? I also agree that this is weird: everyone would have the same issue, and I could not find anything about it. I think I am missing the piece of code that spreads the (:,:,1) mask into (:,:,23).
---- Complete trace as requested ----
(TensorSpec(shape=(96, 128, 3), dtype=tf.float32, name=None), TensorSpec(shape=(96, 128, 1), dtype=tf.uint8, name=None))
Epoch 1/5

ValueError Traceback (most recent call last)
Input In [46], in <cell line: 7>()
5 train_dataset = processed_image_ds.cache().shuffle(BUFFER_SIZE).batch(BATCH_SIZE)
6 print(processed_image_ds.element_spec)
----> 7 model_history = unet.fit(train_dataset, epochs=EPOCHS)

File /usr/local/lib/python3.8/dist-packages/keras/utils/traceback_utils.py:67, in filter_traceback.<locals>.error_handler(*args, **kwargs)
65 except Exception as e: # pylint: disable=broad-except
66 filtered_tb = _process_traceback_frames(e.__traceback__)
---> 67 raise e.with_traceback(filtered_tb) from None
68 finally:
69 del filtered_tb

File /tmp/__autograph_generated_file2dv34ir7.py:15, in outer_factory.<locals>.inner_factory.<locals>.tf__train_function(iterator)
13 try:
14 do_return = True
---> 15 retval_ = ag__.converted_call(ag__.ld(step_function), (ag__.ld(self), ag__.ld(iterator)), None, fscope)
16 except:
17 do_return = False

ValueError: in user code:

File "/usr/local/lib/python3.8/dist-packages/keras/engine/training.py", line 1051, in train_function  *
    return step_function(self, iterator)
File "/usr/local/lib/python3.8/dist-packages/keras/engine/training.py", line 1040, in step_function  **
    outputs = model.distribute_strategy.run(run_step, args=(data,))
File "/usr/local/lib/python3.8/dist-packages/keras/engine/training.py", line 1030, in run_step  **
    outputs = model.train_step(data)
File "/usr/local/lib/python3.8/dist-packages/keras/engine/training.py", line 890, in train_step
    loss = self.compute_loss(x, y, y_pred, sample_weight)
File "/usr/local/lib/python3.8/dist-packages/keras/engine/training.py", line 948, in compute_loss
    return self.compiled_loss(
File "/usr/local/lib/python3.8/dist-packages/keras/engine/compile_utils.py", line 201, in __call__
    loss_value = loss_obj(y_t, y_p, sample_weight=sw)
File "/usr/local/lib/python3.8/dist-packages/keras/losses.py", line 139, in __call__
    losses = call_fn(y_true, y_pred)
File "/usr/local/lib/python3.8/dist-packages/keras/losses.py", line 243, in call  **
    return ag_fn(y_true, y_pred, **self._fn_kwargs)
File "/usr/local/lib/python3.8/dist-packages/keras/losses.py", line 1787, in categorical_crossentropy
    return backend.categorical_crossentropy(
File "/usr/local/lib/python3.8/dist-packages/keras/backend.py", line 5119, in categorical_crossentropy

ValueError: Shapes (None, 96, 128, 1) and (None, 96, 128, 23) are incompatible

Can you please share your training code as well, so that it can be correlated with the error logs to point out the issue?

Please remember that the Course Honor Code does not allow public sharing of solution code for the assignments. If the mentors need to see your code in order to find a problem, there are private ways to do that using DM (Direct Message) threads.


Yes, there is a topic about that on the DLS FAQ Thread (the first topic on the list).

That happens in the last Conv2D layer of your overall U-net model. If you look at the summary of that model, then the output shape should look like this:

conv2d_47 (Conv2D)             (None, 96, 128, 32)  18464       ['concatenate_8[0][0]']          
 conv2d_48 (Conv2D)             (None, 96, 128, 32)  9248        ['conv2d_47[0][0]']              
 conv2d_49 (Conv2D)             (None, 96, 128, 32)  9248        ['conv2d_48[0][0]']              
 conv2d_50 (Conv2D)             (None, 96, 128, 23)  759         ['conv2d_49[0][0]']              
Total params: 8,640,471
Trainable params: 8,640,471
Non-trainable params: 0

But if you don’t see that, then it’s hard to believe your model would pass the grader.
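As a sanity check on that summary, the parameter counts follow from the standard Conv2D formula: (kernel_h * kernel_w * in_channels + 1) * filters, where the +1 is the bias per filter. A small pure-Python sketch (the layer shapes are taken from the summary above; conv2d_47's 64 input channels come from concatenating two 32-channel tensors in the skip connection):

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Parameter count of a Conv2D layer with bias:
    one (kh x kw x in_channels) kernel plus one bias per filter."""
    return (kernel_h * kernel_w * in_channels + 1) * filters

# conv2d_47: 3x3 conv on the 64-channel concatenation (32 + 32 skip channels)
print(conv2d_params(3, 3, 64, 32))  # 18464
# conv2d_48 / conv2d_49: 3x3 convs on 32 channels
print(conv2d_params(3, 3, 32, 32))  # 9248
# conv2d_50: the final 1x1 conv mapping 32 channels to 23 class scores
print(conv2d_params(1, 1, 32, 23))  # 759
```

That last layer is exactly where the 23 output channels come from: a 1x1 convolution with n_classes filters.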


I really appreciate your help. I got a fresh copy and it runs. I found a couple of cells after the assignment sections missing compared to the freshly downloaded version _v2 (following paulinpaloalto guidance). I still do not understand how we can do the backwards propagation when the model outputs 23 channels, and the training has 1 - but I can try to figure it out. Thanks you the help.

The training data has the output y values in “categorical” form, meaning a single number from 0 to 22 to represent the 23 classes. When you have softmax output, it gives you 23 separate values that are the probabilities of each class being the prediction for a given pixel. To compute the loss, you either convert the categorical y values to one hot form first (see tf.one_hot) or use the version of cross entropy loss that supports the labels in categorical form. That’s the SparseCategoricalCrossentropy function.
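To see why the two loss variants agree, here is a minimal NumPy sketch (just the math, not the Keras implementation): cross entropy against a one-hot target picks out -log of the probability assigned to the true class, which is exactly what the sparse version computes directly from the integer label.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Convert integer labels to one-hot vectors (what tf.one_hot does)."""
    return np.eye(num_classes)[labels]

def categorical_ce(y_true_onehot, probs):
    """Cross entropy with one-hot targets: -sum(y * log(p))."""
    return -np.sum(y_true_onehot * np.log(probs), axis=-1)

def sparse_categorical_ce(labels, probs):
    """Cross entropy with integer targets: -log(p[label])."""
    return -np.log(probs[np.arange(len(labels)), labels])

# Softmax probabilities for 4 "pixels" over 23 classes
rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 23))
probs = np.exp(logits) / np.exp(logits).sum(axis=-1, keepdims=True)
labels = np.array([0, 5, 22, 11])

# Both formulations give the same per-pixel loss
print(np.allclose(categorical_ce(one_hot(labels, 23), probs),
                  sparse_categorical_ce(labels, probs)))  # True
```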

When you store the label values, it makes sense to save space and keep them in categorical form. Image segmentation is an extreme case, of course, since you have a label value for every single pixel in the input images. It’s easy to convert “on the fly” to one hot form when you are running the training, although as I pointed out there are loss functions that will take care of that for you.
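For example, a stored mask of shape (96, 128, 1) expands on the fly to (96, 128, 23). A NumPy sketch of that conversion (in the assignment, tf.one_hot inside a tf.data map would do the same job without materializing all the masks up front):

```python
import numpy as np

def expand_mask(mask, num_classes=23):
    """Turn an integer mask of shape (H, W, 1) into one-hot
    form of shape (H, W, num_classes): each pixel's label
    indexes a row of the identity matrix."""
    return np.eye(num_classes, dtype=np.uint8)[mask[..., 0]]

# A random categorical mask with labels 0..22, like the dataset's uint8 masks
mask = np.random.default_rng(1).integers(0, 23, size=(96, 128, 1))
onehot = expand_mask(mask)
print(onehot.shape)  # (96, 128, 23)
# Exactly one channel is "hot" per pixel
print(onehot.sum(axis=-1).min(), onehot.sum(axis=-1).max())  # 1 1
```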