W3 A1 (Car Detection with YOLO) Max Probability Score

shailesh_dagar · March 21, 2024, 3:47am

In Figure 4 of the programming notebook, the probability score (pc x ci) is calculated for every class i of a particular anchor box in the output for a grid cell of the input image. Why don’t we just take the max of all the class probabilities first and then multiply it by pc to get the probability score of detecting a particular class of object within that anchor box? It would save a lot of redundant multiplications (79 in this case).

carlosrl · March 21, 2024, 6:01am

Hi @shailesh_dagar ,
I am not sure I got your point, so feel free to add your comments.
The reason we don’t do what you suggest is that we want to preserve the individual class probabilities. In the YOLO model, each bounding box is associated with a class label and a corresponding class probability. The calculation pc * c is done for each class, and the class with the highest score is assigned to the bounding box.
If we were to take the maximum class probability first and then multiply it by pc, we would be assuming that the bbox can only contain an object of the class with the highest probability. This would not be correct, as the bbox could contain an object of any class. Moreover, taking the maximum class probability first would not necessarily save computational resources. The class probabilities are computed by the model during the forward pass, and this computation is necessary regardless of whether we take the maximum first or not. The number of multiplications would remain the same, as we still need to compute the class score for each class.
Keep learning!

ai_curious · March 22, 2024, 9:51pm

I’m confused about a couple of these statements.

Isn’t that exactly what we want to do, assign one and only one class to each predicted bounding box? We could just assign the max class probability, but here we weight by the object presence confidence, which is class independent. Seems like you can take the maximum value after the multiplication or do the multiplication after extracting the highest confidence class, but in either case, isn’t the numeric result the same? As far as I can tell, there is no loss of information by taking the max first, as the vector of class predictions output by the neural net is still there, unchanged, to do anything you want with. But you would end up with a pure scalar multiplication rather than a broadcast.
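
A minimal NumPy sketch of that equivalence for a single anchor box. The variable names and the random inputs are assumptions for illustration only, not the notebook's code. It relies on pc being non-negative (it is a confidence in [0, 1]); with a negative multiplier the ordering of the classes would flip and the two results could differ.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical single anchor box: pc is the object-presence confidence,
# c holds the 80 class probabilities.
pc = 0.6
c = rng.random(80)
c /= c.sum()  # normalize so that c looks like a probability vector

# Order A: weight every class, then take the max (80 multiplications).
scores = pc * c
score_a = scores.max()
class_a = int(scores.argmax())

# Order B: take the max first, then weight once (1 multiplication).
class_b = int(c.argmax())
score_b = pc * c[class_b]

print(class_a == class_b, np.isclose(score_a, score_b))
```

The class vector c is untouched in both orders, so nothing is lost by reducing first.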

If by “during” you mean “once at the completion of”, then I agree that the unweighted class scores are outputs of the neural network and all the work to produce that vector of outputs is done in the last layer of every forward pass. But this weighting of the raw class score prediction by the object presence prediction isn’t done by the neural net. Rather, it’s a postprocessing step.
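
As a rough sketch of that postprocessing step over a whole output tensor, the two orderings can be compared directly. The 19x19 grid, the 5 anchor boxes, the array names, and the random values standing in for network outputs are all assumptions for illustration; only the 80 classes follow from the question above.

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed shapes: 19x19 grid, 5 anchor boxes, 80 classes.
# Random values stand in for the network's outputs.
box_confidence = rng.random((19, 19, 5, 1))
box_class_probs = rng.random((19, 19, 5, 80))

# Multiply-first order: broadcast multiply, then reduce over the class axis.
box_scores = box_confidence * box_class_probs
classes_a = box_scores.argmax(axis=-1)
scores_a = box_scores.max(axis=-1)

# Max-first order: reduce over classes, then one multiply per box.
classes_b = box_class_probs.argmax(axis=-1)
scores_b = box_confidence[..., 0] * box_class_probs.max(axis=-1)

# Element-wise multiplications each order performs.
mults_a = box_class_probs.size  # one per (box, class) pair
mults_b = classes_b.size        # one per box

print(np.array_equal(classes_a, classes_b), np.allclose(scores_a, scores_b))
print(mults_a, mults_b)
```

Under these assumed shapes the multiply-first order does 80 multiplications per box and the max-first order does one, which is the saving of 79 per box raised in the original question. Whether that difference matters in practice is a separate question, since the multiply is cheap next to the forward pass.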

I’m team @shailesh_dagar on this one, at least conceptually. I’m not positive that the weighting multiplication described in this part of the notebook is actually a part of the YOLO v2 implementation, so maybe this is a purely theoretical discussion.
