Unet error / back prop question

If I understood correctly, the training data for a U-Net is an image and its segmentation mask. After the various stages of encoding/decoding in the U-Net, we need to estimate an error and backprop etc …

I may have missed it, but I’m not sure how the unet computes error for what it predicted vs the training masked image. Does it check if each pixel has been classified correctly compared to the training data? Or?

Also, I’ve sort of lost track/sense of what exactly back prop is tuning in the Unet. I could use some help clarifying this.

Thanks!
Nidhi

Hi, @Nidhi_Sachdev!

For semantic segmentation (your case), you try to train a model that predicts a mask that is very similar to the training mask. The error between those masks (the predicted and the original) is computed with a loss function, just like in any other training. In this case, you can use Dice loss, Jaccard loss, etc. With that loss, backpropagation calculates the gradients, and the optimizer applies them to the model's weights at each step.
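To make the "error between masks" idea concrete, here is a minimal numpy sketch of a soft Dice loss (one of the options mentioned above). The function name and shapes are just for illustration, not from the notebook:

```python
import numpy as np

def dice_loss(pred, target, eps=1e-7):
    """Soft Dice loss between a predicted probability mask and a
    binary ground-truth mask. Perfect overlap -> loss 0; no overlap -> loss ~1."""
    intersection = np.sum(pred * target)
    return 1.0 - (2.0 * intersection + eps) / (np.sum(pred) + np.sum(target) + eps)

mask = np.array([[0, 1], [1, 0]], dtype=float)
print(dice_loss(mask, mask))      # perfect prediction -> 0.0
print(dice_loss(1 - mask, mask))  # completely wrong prediction -> ~1.0
```

Because the loss is a differentiable function of the predicted mask, backprop can push gradients from this single scalar all the way back through the decoder and encoder weights.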

It’s an interesting question! They show you in the notebook how to do this, although the explanation leaves a lot of detail to the reader’s imagination. See section 3.6 - Loss Function in the notebook. They use the TF sparse categorical cross entropy loss function. The output is softmax at the pixel level, so they are applying the softmax version of cross entropy at the pixel level. If you look at the main documentation page for SCCE loss, it also links to this page about the “reduction” function, which makes it clear that it can handle inputs with the dimensions (batch size, height, width, number of classes). At this level they don’t really say how the gradients work, but presumably they either average them or sum them across the samples dimension but apply them “pixel wise”.
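Here is a hedged numpy sketch of what "sparse categorical cross entropy at the pixel level" amounts to: softmax over the class axis at every pixel, pick out the probability of the true class, then average over batch, height, and width into one scalar (which is roughly what TF's default reduction does). The function name and shapes are made up for illustration:

```python
import numpy as np

def pixelwise_scce(logits, labels):
    """Sparse categorical cross entropy per pixel, averaged into one scalar.
    logits: (batch, H, W, C) raw scores; labels: (batch, H, W) class ids."""
    # numerically stable softmax over the class axis
    z = logits - logits.max(axis=-1, keepdims=True)
    probs = np.exp(z) / np.exp(z).sum(axis=-1, keepdims=True)
    # probability assigned to the true class at each pixel
    true_probs = np.take_along_axis(probs, labels[..., None], axis=-1).squeeze(-1)
    # average the per-pixel losses over batch, height, and width
    return -np.log(true_probs).mean()

rng = np.random.default_rng(0)
logits = rng.normal(size=(2, 4, 4, 3))        # 2 images, 4x4 pixels, 3 classes
labels = rng.integers(0, 3, size=(2, 4, 4))   # true class id per pixel
print(pixelwise_scce(logits, labels))         # positive scalar loss
```

The key point for the gradient question: the loss is one scalar, so there is one backward pass, but because every pixel contributed a term to that scalar, every pixel contributes to the gradients of the shared weights.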


Thanks for the pointers. That helped clarify things a bit wrt the loss functions. But not so much the gradients. Probably working more on the programming exercise will help me internalize this better.

Thanks!
Nidhi

We are using TF’s automatic gradient calculations here, so I don’t think you’ll get any more details on how the gradients actually work by looking at what is shown in the notebook: that action is happening “under the covers”.
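That said, for the softmax + cross entropy combination there is a well-known closed form you can sketch by hand: the gradient of the loss with respect to the logits at a single pixel is simply (softmax probabilities - one-hot true label). A toy numpy sketch of the step TF automates (values chosen arbitrarily for illustration):

```python
import numpy as np

logits = np.array([2.0, 0.5, -1.0])   # raw scores for 3 classes at one pixel
true_class = 2                        # ground-truth class id for that pixel

# stable softmax
probs = np.exp(logits - logits.max())
probs /= probs.sum()
one_hot = np.eye(3)[true_class]

grad = probs - one_hot                # dLoss/dlogits for softmax + cross entropy
lr = 0.5
logits_updated = logits - lr * grad   # one gradient-descent step

# the true class's logit has gone up after the step
print(logits_updated[true_class] > logits[true_class])  # True
```

In the real network the update lands on the conv-layer weights that produced the logits (that is what backprop is "tuning"), but the direction of the nudge at each pixel is exactly this.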

Hi, @Nidhi_Sachdev!

As @paulinpaloalto said, that process is typically hidden from the user and works “under the hood”. In case you are curious, you can check this article from the TensorFlow documentation.

Noted. I’ll read up more after completing the notebook. At this point, I’m still struggling a bit with the 10k foot view :)