Doubt regarding YOLO class prediction and anchor boxes

Shriyam_Avasthi · March 17, 2024, 9:50pm

I can't fully grasp the intuition behind anchor boxes in YOLO. According to the papers, each grid cell is responsible for predicting the class of the object inside it. That leads to my first question: since the grid is 19x19, won’t a single cell contain too little information to capture the whole context of the object, especially for large objects? Second, since each cell predicts the class, how does having different anchor boxes help? A box can span multiple grid cells, yet the prediction seems to be based only on the content of one particular grid cell (which always lies completely inside the anchor box). This would only make sense if all the content inside the anchor box were taken into account when predicting the class. Is that the case?

paulinpaloalto · March 18, 2024, 12:20am

Yes, that sounds right. Two key points: a) bounding boxes and anchor boxes are related, but they are not the same thing, and b) bounding boxes do not have to be contained within a single grid cell. The grid cells are just used to organize the output: each object is assigned to the cell that contains its centroid.
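
To make the centroid rule concrete, here is a minimal sketch. It assumes the 19x19 grid from the question and box coordinates normalized to the range 0 to 1. The function names and the example box are made up for illustration; they are not taken from the course code.

```python
import math

GRID = 19  # grid size mentioned in the question (19x19)

def assign_to_cell(cx, cy, grid=GRID):
    """Return (row, col) of the grid cell that contains the box center.

    cx, cy are the center coordinates, normalized to [0, 1].
    """
    col = min(int(cx * grid), grid - 1)  # clamp so cx == 1.0 stays in range
    row = min(int(cy * grid), grid - 1)
    return row, col

def cells_covered(cx, cy, w, h, grid=GRID):
    """Count how many grid cells the box overlaps (w, h also normalized)."""
    x_min, x_max = max(cx - w / 2, 0.0), min(cx + w / 2, 1.0)
    y_min, y_max = max(cy - h / 2, 0.0), min(cy + h / 2, 1.0)
    cols = math.ceil(x_max * grid) - math.floor(x_min * grid)
    rows = math.ceil(y_max * grid) - math.floor(y_min * grid)
    return rows * cols

# Hypothetical large object: centered in the image, half the image wide and tall.
cx, cy, w, h = 0.5, 0.5, 0.5, 0.5
print("responsible cell:", assign_to_cell(cx, cy))
print("cells overlapped by the box:", cells_covered(cx, cy, w, h))
```

Only one cell ends up responsible for the object, even though its box overlaps many cells. The cell is an index into the output, not a limit on the box size.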

This is a pretty deep topic. YOLO is by far the most complex algorithm we’ve encountered in any of the DLS courses so far. There are some great threads explaining YOLO in more detail that are worth a look. Here’s one to get started on the role of anchor boxes.

ai_curious · March 18, 2024, 12:52am

I would say this is partly correct. It is a common misconception that the input image itself is divided into grid cells. It is not. Each grid cell is responsible for localizing (where is it) and classifying (what is it) objects that are predicted to be centered inside it. But there is no requirement that the predicted bounding box be contained entirely within one grid cell. Maybe take a read through the several threads that have discussed these concepts in detail and let us know what you learn or what questions remain. Cheers
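
As a rough illustration of what the anchor boxes add: within the responsible cell, a ground-truth box is commonly matched to the anchor whose shape it overlaps best, so each anchor slot tends to specialize in one kind of shape. The sketch below assumes two made-up anchors measured in grid-cell units and compares shapes only, as if both boxes shared a center. The anchor values, object sizes, and function names are hypothetical, not taken from the course assignment.

```python
def shape_iou(box_wh, anchor_wh):
    """IoU of two boxes compared by shape only (both centered at the same point)."""
    bw, bh = box_wh
    aw, ah = anchor_wh
    inter = min(bw, aw) * min(bh, ah)
    union = bw * bh + aw * ah - inter
    return inter / union

def best_anchor(box_wh, anchors):
    """Index of the anchor whose shape has the highest IoU with the box."""
    ious = [shape_iou(box_wh, a) for a in anchors]
    return max(range(len(anchors)), key=lambda i: ious[i])

# Hypothetical anchors as (width, height) in grid-cell units: one tall, one wide.
ANCHORS = [(1.0, 3.0), (3.0, 1.0)]

# Hypothetical ground-truth shapes, both larger than a single cell.
pedestrian = (1.2, 3.5)
car = (4.0, 1.5)

print("pedestrian -> anchor", best_anchor(pedestrian, ANCHORS))
print("car        -> anchor", best_anchor(car, ANCHORS))
```

Both example objects are larger than one cell, yet each is still handled by a single (cell, anchor) slot. The anchor gives the network a starting shape to adjust; it does not crop the region the prediction is based on.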

Topic	Category	Replies	Views	Activity
YOLO Algorithm and grid cells	Convolutional Neural Networks, week-3	11	88	March 19, 2025
How does a cell detect a bounding box bigger than itself, YOLO?	Convolutional Neural Networks	6	825	July 10, 2021
What are anchor boxes doing? week 3, assignment 1	Convolutional Neural Networks	5	740	September 27, 2021
Course4 Week3: Understanding YOLO Algorithm	Convolutional Neural Networks	5	816	March 18, 2025
How does YOLO know if 3 cells make 1 object?	Convolutional Neural Networks	3	611	August 14, 2023
