I still do not understand the loss function used for the semantic segmentation using U-Net.
Sparse categorical cross-entropy receives a prediction tensor of shape (None, width, height, num_classes). My understanding is that it does so because the loss function uses tf.argmax() to turn the prediction into a single-channel label image. Is that correct?
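To make the shapes in question concrete, here is a minimal NumPy sketch of what a sparse categorical cross-entropy computes per pixel. It is only an illustration under my own assumptions, not the Keras implementation: the function name `sparse_cce` and the toy shapes are made up for the example. In this sketch the loss picks the predicted log-probability at the integer label of each pixel, and argmax appears only when the prediction is converted into a label map.

```python
import numpy as np

def sparse_cce(y_true, logits):
    """Mean per-pixel cross-entropy for integer labels (illustrative, not Keras code)."""
    # Numerically stable log-softmax over the class axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    # Pick the log-probability of the true class at every pixel.
    picked = np.take_along_axis(log_probs, y_true[..., None], axis=-1)
    return float(-picked.mean())

rng = np.random.default_rng(0)
batch, height, width, num_classes = 2, 4, 4, 3

# Stand-in for the model output: one score per class per pixel.
logits = rng.normal(size=(batch, height, width, num_classes))
# Stand-in for the ground truth: one integer class id per pixel.
y_true = rng.integers(0, num_classes, size=(batch, height, width))

loss = sparse_cce(y_true, logits)
# Collapsing the class axis gives a single-channel label map.
pred_labels = logits.argmax(axis=-1)

print(logits.shape, y_true.shape, pred_labels.shape)
```

The labels here have no class axis, while the prediction keeps one; that difference in shape is what the sketch is meant to show.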
The example uses only the accuracy metric.
But what if I want to check IoU, Dice, and so on?
I tried adding them, but I get an error saying that the model output has a different shape.
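For reference, this is roughly what I would expect IoU and Dice to compute once both inputs are label maps of the same shape. It is a minimal NumPy sketch, not code from the example: the helper `iou_and_dice` and the toy arrays are hypothetical. The key assumption is that the prediction of shape (batch, height, width, num_classes) is reduced with argmax over the last axis before the comparison.

```python
import numpy as np

def iou_and_dice(y_true, y_pred_labels, num_classes):
    """Per-class IoU and Dice from two integer label maps of the same shape."""
    ious, dices = [], []
    for c in range(num_classes):
        t = (y_true == c)
        p = (y_pred_labels == c)
        inter = np.logical_and(t, p).sum()
        union = np.logical_or(t, p).sum()
        total = t.sum() + p.sum()
        # A class absent from both maps has no defined score.
        ious.append(inter / union if union else np.nan)
        dices.append(2 * inter / total if total else np.nan)
    return np.array(ious), np.array(dices)

# Ground truth: (batch, height, width) integer labels.
y_true = np.array([[[0, 0],
                    [1, 1]]])
# Model output: (batch, height, width, num_classes) class probabilities.
probs = np.array([[[[0.9, 0.1], [0.2, 0.8]],
                   [[0.3, 0.7], [0.4, 0.6]]]])

# Reduce the class axis so the prediction matches the label shape.
y_pred_labels = probs.argmax(axis=-1)

ious, dices = iou_and_dice(y_true, y_pred_labels, num_classes=2)
print(ious, dices)
```

If this is right, a Keras metric such as tf.keras.metrics.MeanIoU, which compares class indices, would presumably need the same argmax step applied to the model output first, for example inside a custom metric.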