A doubt in Week 3 U-Net Assignment

Dear Mentor,

Could you please guide me on the following issue?

There is a sentence quoted from Section 3.1 in
Week 3 Assignment - Image_segmentation_Unet_v2

“Final Feature Mapping Block: In the final layer, a 1x1 convolution is used to map each 64-component feature vector to the desired number of classes. The channel dimensions from the previous layer correspond to the number of filters used, so when you use 1x1 convolutions, you can transform that dimension by choosing an appropriate number of 1x1 filters. When this idea is applied to the last layer, you can reduce the channel dimensions to have one layer per class.”


May I know how to understand this sentence: “When this idea is applied to the last layer, you can reduce the channel dimensions to have one layer per class.”?

Let's say there are 11 classes. Shouldn’t there be only 1 output (softmax) layer that gives the probabilities of the 11 classes?

Thank you.

Hello @JJaassoonn, if we have 11 classes, then each pixel has 11 logits (or 11 probabilities), right?

Dear Mr Raymond,

Yes, I agree. Let's say there is an image of 128 x 128 pixels. Each pixel has a vector of 11 entries, and each entry represents the probability that the pixel belongs to one of the 11 classes.
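To make the shapes concrete, here is a minimal NumPy sketch of that output. The logits are random placeholder values (an assumption for illustration, not real model output), and the variable names are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
height, width, num_classes = 128, 128, 11  # sizes from the example above

# Placeholder raw scores (logits): one 11-entry vector per pixel.
logits = rng.normal(size=(height, width, num_classes))

# Softmax over the last (channel) axis turns each pixel's vector into probabilities.
shifted = logits - logits.max(axis=-1, keepdims=True)  # subtract max for numerical stability
probs = np.exp(shifted) / np.exp(shifted).sum(axis=-1, keepdims=True)

# The predicted class of each pixel is the index of its largest entry.
pred_mask = probs.argmax(axis=-1)

print(probs.shape)      # (128, 128, 11)
print(pred_mask.shape)  # (128, 128)
```

Each pixel's 11 probabilities sum to 1, and taking the argmax over the channel axis gives a 128 x 128 map of class indices.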

When this idea is applied to the last layer, we can reduce the channel dimensions from 64 to 11, so that there is one layer for each of the 11 classes.

Would it be more accurate if I modified the sentence like this?

Yes. So, using the wording of that assignment, we need 11 “layers”. Here “layer” does not mean 11 TensorFlow layer objects; it means the 11 channels of the single output layer.
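As a rough illustration, a 1x1 convolution can be sketched in NumPy as the same 64-to-11 linear map applied at every pixel. The feature values and weights below are random placeholders (assumptions for illustration only), not the assignment's actual model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder output of the previous block: 128 x 128 pixels, 64 channels.
features = rng.normal(size=(128, 128, 64))

# Eleven 1x1 filters, each holding 64 weights (one per input channel).
kernel = rng.normal(size=(64, 11))
bias = np.zeros(11)

# A 1x1 convolution is a per-pixel matrix multiply over the channel axis.
logits = features @ kernel + bias

print(logits.shape)  # (128, 128, 11)

# Channel k of the output is the "layer" for class k in the assignment's wording.
class_3_map = logits[..., 3]
print(class_3_map.shape)  # (128, 128)
```

In Keras this would typically be one `Conv2D` layer with 11 filters and a kernel size of 1, so there is still a single layer object whose output has 11 channels.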

I think this is fine :slight_smile:

Thank you so much, Mr Raymond

You are welcome, @JJaassoonn.