U-net assignment: Confused about Y_train dimensions

Hi All,

This is my understanding: the last layer of the U-net has the same (height × width) dimensions as the input, and the number of channels equals the number of classes we would like to label in the picture, in this case 23 classes. Hence the output of the U-net is (96, 128, 23), and the value in each channel represents the probability that a pixel belongs to a certain class.
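For example, something like the following final layer produces that shape (just a rough sketch with a made-up 32-channel decoder feature map, not the assignment's actual code):

```python
import tensorflow as tf

# Sketch only: 23 classes, 96x128 images; the 32-channel decoder feature map
# is a placeholder, not the assignment's real decoder output.
n_classes = 23
decoder_features = tf.zeros((1, 96, 128, 32))

# A 1x1 convolution maps the decoder features to one channel per class.
final_layer = tf.keras.layers.Conv2D(n_classes, kernel_size=1, padding='same')
logits = final_layer(decoder_features)

print(logits.shape)  # (1, 96, 128, 23): one score per class for every pixel
                     # (these become probabilities after a softmax)
```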

But shouldn’t Y_train (the mask) also have the same dimensions? In our assignment, y_train (the mask) is only (96, 128, 1), and the third dimension holds the final class label each pixel belongs to, not a probability. How does the model calculate the loss function then?

You convert the y_train values to a “one hot” representation and then use the multidimensional version of “cross entropy” loss.
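Roughly like this (a toy illustration with the shapes from the thread; the random tensors stand in for the real mask and model output):

```python
import tensorflow as tf

# Placeholder data with the thread's shapes: 23 classes, 96x128 images.
n_classes = 23
mask = tf.random.uniform((96, 128, 1), maxval=n_classes, dtype=tf.int32)
pred = tf.random.uniform((96, 128, n_classes))  # stand-in for U-net output

# One-hot: the single label channel becomes one channel per class.
mask_onehot = tf.one_hot(tf.squeeze(mask, axis=-1), depth=n_classes)
print(mask_onehot.shape)  # (96, 128, 23)

# Now y_true and y_pred have matching shapes, so cross entropy applies per pixel.
loss = tf.keras.losses.CategoricalCrossentropy(from_logits=True)(mask_onehot, pred)
print(float(loss))
```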

Hi Paul,

I assume loss=tf.keras.losses.SparseCategoricalCrossentropy performs the one-hot encoding for us, since I don’t see anywhere in the assignment where we explicitly one-hot encode y_train?

There is a documentation page for that function. Have a look!
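In short, the sparse version consumes the integer mask directly (no explicit one-hot step) and computes the same loss as the one-hot route. A rough check, with placeholder tensors standing in for the assignment data:

```python
import tensorflow as tf

# Placeholder mask and logits with the thread's shapes (batch of 4).
n_classes = 23
mask = tf.random.uniform((4, 96, 128, 1), maxval=n_classes, dtype=tf.int32)
logits = tf.random.uniform((4, 96, 128, n_classes))

# Sparse loss: integer labels go in as-is.
sparse_loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)(
    mask, logits)

# Equivalent one-hot route.
onehot_loss = tf.keras.losses.CategoricalCrossentropy(from_logits=True)(
    tf.one_hot(tf.squeeze(mask, axis=-1), depth=n_classes), logits)

# The two values should match (up to floating point).
print(float(sparse_loss), float(onehot_loss))
```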

Oh I get it! Thank you so much!