Here are some thoughts related to your questions and assertions. Hope it helps.
YOLO doesn’t draw bounding boxes: it predicts bounding box center location and shape (width and height). Whether those predictions ever get visualized depends on the application. If the model is driving an autonomous vehicle, for example, likely not.
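If an application does want to see the boxes, that drawing happens downstream of the network as ordinary post-processing of its numeric outputs. A minimal sketch, assuming the prediction has already been decoded into pixel units (the helper name, arguments, and colors here are just illustrative, not part of any YOLO implementation):

```python
import cv2  # drawing happens outside the network, only if the application wants it

def draw_prediction(image, b_x, b_y, b_w, b_h, label):
    """Turn one predicted (center, shape) into pixels on a copy of the image.

    b_x, b_y, b_w, b_h are assumed to already be in pixel units here.
    """
    x1, y1 = int(b_x - b_w / 2), int(b_y - b_h / 2)
    x2, y2 = int(b_x + b_w / 2), int(b_y + b_h / 2)
    out = image.copy()
    cv2.rectangle(out, (x1, y1), (x2, y2), color=(0, 255, 0), thickness=2)
    cv2.putText(out, label, (x1, max(y1 - 5, 0)),
                cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 1)
    return out
```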
It is common to hear or read that YOLO divides the input image into grid cells. This is not precisely correct. The image isn’t divided at all, which is where the name comes from… you only look at (input) the image once. The entire image is fed to the CNN and processed through forward propagation exactly once. What is divided into grid cells is the ground truth training data Y and the output predictions \hat{Y}.
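To make that concrete, here is a minimal sketch of where the grid actually lives. The grid size, anchor count, class count, and image resolution below are placeholders, not the values from any particular YOLO version:

```python
import numpy as np

# Illustrative sizes only: a 19x19 grid, 5 anchors per cell, 80 classes
S, B, C = 19, 5, 80

# The image itself is never sliced up; it is one tensor, forward-propagated once.
image = np.zeros((608, 608, 3))

# The grid shows up only in the shape of the labels Y and the predictions Y_hat:
# one (objectness, x, y, w, h, class scores) vector per anchor per grid cell.
Y_hat = np.zeros((S, S, B, 5 + C))

print(image.shape)   # (608, 608, 3)   -> no grid here
print(Y_hat.shape)   # (19, 19, 5, 85) -> the "grid" is just the leading two dims
```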
One needs to be careful making assertions about what YOLO does or how it works, because there are many versions out there and they don’t all work exactly the same way. I think v1 made a single class prediction shared by all the predicted bounding boxes in a given grid cell, whereas v2 makes a separate class prediction for each predicted bounding box in a grid cell; the shape comparison below illustrates the difference.
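Here is one way to see that difference, in terms of the output tensor layout. The grid size, box count, and class count are again placeholders rather than the real configurations of either version:

```python
# S = grid size, B = boxes per cell, C = number of classes (illustrative values)
S, B, C = 7, 2, 20

# v1-style: 5 numbers (x, y, w, h, confidence) per box, but the C class scores
# are predicted once per grid cell and shared by all B boxes in that cell.
v1_output_shape = (S, S, B * 5 + C)      # (7, 7, 30) with these numbers

# v2-style: every box (anchor) carries its own class scores, so each box
# contributes (5 + C) numbers of its own.
v2_output_shape = (S, S, B * (5 + C))    # (7, 7, 50) with these numbers
```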
It is correct that a predicted bounding box can be larger than a grid cell. There are threads already in the forum that describe the math of how and why; a short sketch is below. It is important to connect this idea with the fact that the entire image is the source of information for each bounding box and classification prediction, not just a grid-cell-shaped subregion of it. Also, again, it is object center location and shape that the YOLO neural net predicts; bounding boxes are not drawn by it.
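As a sketch of how a box can outgrow its cell, here is the v2-style box decoding, working in grid-cell units. The raw values and anchor sizes are made up, and the function name is just for illustration; the point is that the sigmoid confines the predicted center to its own cell, while the width and height are exponential scalings of the anchor dimensions and can span many cells:

```python
import numpy as np

def decode_box_v2_style(t_x, t_y, t_w, t_h, c_x, c_y, p_w, p_h):
    """Decode one raw prediction into a box, in grid-cell units.

    (c_x, c_y) is the cell's top-left corner, (p_w, p_h) the anchor size.
    This follows the v2-style parameterization; treat it as illustrative.
    """
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    b_x = c_x + sigmoid(t_x)      # center is confined to its own cell ...
    b_y = c_y + sigmoid(t_y)
    b_w = p_w * np.exp(t_w)       # ... but width/height are unconstrained
    b_h = p_h * np.exp(t_h)       #     multiples of the anchor dimensions
    return b_x, b_y, b_w, b_h

# With a 2-cell-wide anchor and t_w = 1.0, the predicted box spans ~5.4 cells:
print(decode_box_v2_style(0.0, 0.0, 1.0, 1.0, c_x=3, c_y=3, p_w=2.0, p_h=2.0))
```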
This related thread has more background. You can find others like it using Search.