U-Net: Combining Final Output Into a Single Image

At the very end of U-Net, we get an output volume of shape h * w * n_c, where n_c is the number of classes. We combine the channels into a single image to get the segmented output.

How exactly do we combine the channels? Something is said about taking the max.

Can someone please elaborate on this? Do we give priority to certain classes in the training set? Say we want the car to be at the front, so we assign the car class the highest activation value and then just take the maximum activation across all channels for each pixel?

Am I correct in my understanding?

As far as I remember, every pixel of the image belongs to exactly one class; there is no priority list. In the training phase you have images and segmentation maps, and the model learns to classify each pixel into the right class according to the segmentation map.

The argmax here means taking, for each pixel, the class with the maximum probability that comes out of the model; that class is ultimately assigned to that pixel.


Right! The output for each pixel is a softmax distribution across all the possible classes. That is how the prediction is expressed. To translate that to a categorical class, we just need to take the argmax. That’s how it always works in multiclass classifiers, but the new and salient point here is that we’re classifying every single individual pixel in the image, instead of the usual image level classification.
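As a minimal sketch (not from the course materials), here is how that per-pixel argmax could look with NumPy, assuming the model output has shape (h, w, n_c) and the channel vector at each pixel is a softmax distribution over the classes. The sizes and the `probs` array are made up for illustration:

```python
import numpy as np

# Hypothetical sizes and a fake softmax output, just for illustration.
h, w, n_c = 96, 128, 23
probs = np.random.rand(h, w, n_c)
probs = probs / probs.sum(axis=-1, keepdims=True)  # normalize so channels sum to 1 per pixel

# For each pixel, pick the class with the highest probability.
mask = np.argmax(probs, axis=-1)  # shape (h, w), one integer class id per pixel
```

The resulting (h, w) mask is the "single image" you visualize: each pixel value is the index of its predicted class.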

Well, we actually do the usual from_logits = True mode on the loss function here, so the prediction outputs of the model as written are raw logits that we could feed to softmax manually. But softmax is monotonic, so just taking the argmax of the logits gives you the same answer. It's up to you which way you choose to implement that.
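To make that concrete, here is a small sketch (assumed shapes, not the assignment code) showing that the argmax over raw logits matches the argmax over softmax(logits), since softmax preserves the ordering of the values:

```python
import tensorflow as tf

# Hypothetical raw model outputs of shape (h, w, n_c).
logits = tf.random.normal((96, 128, 23))
probs = tf.nn.softmax(logits, axis=-1)  # explicit softmax over the class channels

mask_from_logits = tf.argmax(logits, axis=-1)
mask_from_probs = tf.argmax(probs, axis=-1)

# The two masks agree, because softmax does not change which class is largest.
assert bool(tf.reduce_all(mask_from_logits == mask_from_probs))
```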
