Where does backpropagation start from in a YOLO algorithm?

So, I know that in a YOLO algorithm, the ConvNet first outputs the model's predicted values. For the sake of clarity, let's say we have a 608x608x3 image, and we get an output of shape 19x19x5x85. In this final output, the model predicts the values of y. This first part is one complete forward pass through the network.

Now, in the YOLO algorithm, the model goes on from here, filtering this output and applying non-max suppression, which gives the final output of the YOLO algorithm.

My question here is… does the model stop after the first part and start performing backpropagation, updating the parameters? Or does the model start performing backpropagation only after going through all the steps, including non-max suppression, up to the final output of the YOLO algorithm?

It has been some time since I read about YOLO, but as far as my understanding goes, backpropagation happens before non-max suppression, right at the point where you get the initial labels and zones of interest.

Then, once you have those, some post-processing such as non-max suppression can be done.

@gent.spah has it right. Backpropagation is part of training; NMS occurs during operational use. During training, you have access to ground truth for each object and detector. Within the loss function, you iteratively compare ground truth to prediction, compute the loss, and use backprop to minimize it. False-positive “duplicate” predictions are dealt with through this process, which penalizes mistakes in favor of correct predictions. NMS is run after a forward propagation only during operational use, when ground truth is not available, as a means of suppressing false positives. HTH
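
To make the split concrete, here is a minimal sketch of a greedy NMS step in plain Python. It is not the course's implementation: the function names, the (x1, y1, x2, y2) corner box format, and the 0.5 threshold are all assumptions chosen for illustration. The point is that it takes only predicted boxes and scores, no ground truth, so it sits after the forward pass at inference time and plays no part in the loss.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: return indices of kept boxes, highest score first.

    Inference-only post-processing. It needs predictions and scores,
    but no ground truth and no loss.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with a kept one.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Hypothetical predictions: two near-duplicate boxes and one separate box.
boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
scores = [0.9, 0.8, 0.7]
kept = non_max_suppression(boxes, scores)
print(kept)  # [0, 2]
```

The lower-scoring duplicate (index 1) is dropped because its IoU with box 0 is about 0.82, above the threshold. In training, the same duplicate would instead be penalized through the loss.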

PS: it’s a nuance, but it is important to note that in the implementation of YOLO v2 used in this class, the final layer of the network does not apply activation functions during forward propagation. Instead, the activation functions are applied separately, so that different functions can be applied to the different elements of the predictions. Specifically, the shape predictions apply the exponential function to the predicted value, not the sigmoid.
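
A rough NumPy sketch of what "applied separately" can look like follows. The channel layout (x, y, w, h, confidence, then 80 class scores), the anchor values, and the name `decode_raw_output` are assumptions for illustration, not taken from the class code. Only the exponential on the shape predictions comes from the note above; using sigmoid for the center and confidence and softmax for the classes follows the usual description of YOLO v2.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    shifted = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / np.sum(e, axis=axis, keepdims=True)


def decode_raw_output(raw, anchors):
    """Apply a different activation to each slice of the raw output.

    raw:     (19, 19, 5, 85) linear outputs of the last layer.
    anchors: (5, 2) anchor widths and heights (hypothetical values).
    Assumed layout of the last axis: x, y, w, h, confidence, 80 classes.
    """
    box_xy = sigmoid(raw[..., 0:2])           # center offsets in (0, 1)
    box_wh = np.exp(raw[..., 2:4]) * anchors  # shape: exponential, not sigmoid
    confidence = sigmoid(raw[..., 4:5])       # objectness in (0, 1)
    class_probs = softmax(raw[..., 5:])       # sums to 1 over the classes
    return box_xy, box_wh, confidence, class_probs


rng = np.random.default_rng(0)
raw = rng.normal(size=(19, 19, 5, 85))  # stand-in for a real forward pass
anchors = np.array([[1.0, 1.0], [2.0, 1.0], [1.0, 2.0], [3.0, 3.0], [5.0, 4.0]])
box_xy, box_wh, confidence, class_probs = decode_raw_output(raw, anchors)
print(box_xy.shape, box_wh.shape, confidence.shape, class_probs.shape)
```

The exponential keeps predicted widths and heights positive and unbounded, whereas a sigmoid would cap them at the anchor size.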

What @ai_curious says is right; he is the YOLO expert :grin:. Long time no see, hope you are doing well!