This is a mapping to the desired number of classes by means of a 1x1 convolution whose number of filters (output channels) equals the desired number of classes. Put differently, for each pixel the 1x1 convolution takes all of that pixel's input channels and produces one value per class. Training adjusts the filter weights so that, for each pixel, the correct class ends up with the highest value.
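To make the "per pixel" part concrete, here is a minimal NumPy sketch of what a 1x1 convolution computes. The shapes (a 4x6 feature map with 32 channels), the random weights, and the helper name `conv1x1` are made up for illustration and are not taken from the assignment; only the 23 classes match the thread.

```python
import numpy as np

def conv1x1(features, weights, bias):
    """Apply a 1x1 convolution to a feature map of shape (H, W, C_in).

    weights has shape (C_in, n_classes) and bias has shape (n_classes,).
    A 1x1 kernel never looks at neighbouring pixels, so the whole layer
    is the same small matrix multiply applied independently at every pixel.
    """
    return features @ weights + bias

rng = np.random.default_rng(0)
n_classes = 23                               # one filter per class
features = rng.normal(size=(4, 6, 32))       # hypothetical 4x6 map, 32 channels
weights = rng.normal(size=(32, n_classes))   # hypothetical filter weights
bias = np.zeros(n_classes)

scores = conv1x1(features, weights, bias)
print(scores.shape)  # (4, 6, 23): one value per class for every pixel
```

Changing `n_classes` to 46 would change only the last dimension of the output; the height and width stay the same.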
Thanks for the reply, but can you explain why it is 23 in the assignment and not 46 or some other number? @reinoudbosch
I really don't get how to select the number of classes, meaning the 23 here. Or are there 23 selections for each channel?
The default is set to 23, but it could just as well be 46 or any other number of classes that are to be distinguished.
As you can see in the code of unet_model, the final layer has n_classes filters. Applying these filters produces the final n_classes channels, one channel per class. Because this is a 1x1 convolution applied after upsampling, the value in each channel at a given pixel indicates how likely that pixel is to belong to the corresponding class. So the predicted class for a pixel is the channel with the highest value for that pixel.
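A small sketch of that last step, assuming the final layer outputs raw scores that are turned into probabilities with a softmax over the channel axis (if the layer already outputs probabilities, the argmax gives the same result). The helper name `predict_mask` and the tiny 1x2 example with 3 classes are invented for illustration.

```python
import numpy as np

def predict_mask(scores):
    """Turn per-pixel class scores of shape (H, W, n_classes) into
    per-pixel probabilities and an (H, W) mask of class indices."""
    # Subtract the per-pixel maximum for numerical stability before exp.
    shifted = scores - scores.max(axis=-1, keepdims=True)
    probs = np.exp(shifted)
    probs /= probs.sum(axis=-1, keepdims=True)   # softmax over channels
    # The predicted class is the channel with the highest probability.
    return probs, probs.argmax(axis=-1)

# Hypothetical output for an image of 1x2 pixels and 3 classes.
scores = np.array([[[2.0, 0.5, -1.0],
                    [0.1, 0.2, 3.0]]])
probs, mask = predict_mask(scores)
print(mask)  # [[0 2]]
```

The mask has one integer per pixel, so with 23 classes each entry would be a value from 0 to 22.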