A clarification about Image Classification and Localization Algorithm and YOLO

Twaijri · August 28, 2022, 5:30pm

Hello!

How can the “Image Classification and Localization Algorithm” localize the boundaries of any of these cars when part of their “actual” boundary is outside the grid cell (which the algorithm shouldn’t be able to see)?

Thanks

paulinpaloalto · August 28, 2022, 6:56pm

YOLO is by far the most complicated system we’ve seen so far, so it’s no wonder that it takes some serious headscratching to understand. The point is not that the algorithm can’t see things outside of the current grid cell: the grid cells are just used to organize the computation. A given object will be reported only for the grid cell that contains its centroid, but there is no requirement that the bounding box of the object lie completely within that grid cell. The bounding box “is what it is”. Over the next couple of lectures and in the assignment, you’ll also see how they deal with the fact that the same object can be reported multiple times with slightly different bounding boxes. Prof Ng doesn’t really say much about how all this complexity gets trained, but it’s a safe bet that “it’s complicated”.
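To make the centroid rule concrete, here is a minimal sketch of how an object could be assigned to a grid cell. The function name `assign_to_cell`, the 608x608 image, the 19x19 grid, and the car's coordinates are all assumptions chosen for illustration, not anything taken from the lectures or from a YOLO implementation.

```python
def assign_to_cell(box, image_size, grid_size):
    """Return the (row, col) of the grid cell that contains the box center.

    box is (x_min, y_min, x_max, y_max) in pixels; image_size is (width, height).
    Hypothetical helper for illustration only.
    """
    x_min, y_min, x_max, y_max = box
    width, height = image_size

    # Only the center of the box decides which cell reports the object.
    cx = (x_min + x_max) / 2
    cy = (y_min + y_max) / 2

    cell_w = width / grid_size
    cell_h = height / grid_size

    # Clamp so a center exactly on the right/bottom edge stays in the last cell.
    col = min(int(cx // cell_w), grid_size - 1)
    row = min(int(cy // cell_h), grid_size - 1)
    return row, col


# Assumed example: a car 300 px wide in a 608x608 image with 32 px grid cells.
car = (100, 200, 400, 380)
print(assign_to_cell(car, (608, 608), 19))  # (9, 7)
```

The car spans roughly nine cells horizontally, yet it is reported by exactly one of them. Nothing in the assignment step looks at the box's width or height.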

If you have more detailed questions about any of this and want to go deeper, there are some great threads from fellow student ai_curious who has done some serious work using and studying YOLO and then writing about it. Here’s a good one to start on and this one is more specific to the question of multiple bounding boxes.
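On the multiple-bounding-boxes point, the course handles duplicates with non-max suppression based on intersection over union (IoU). The following is only a rough sketch of that idea in plain Python. The helper names, the 0.5 threshold, and the example boxes and scores are assumptions for illustration, and a real implementation would normally use a library routine instead.

```python
def iou(a, b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Return indices of boxes to keep, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap heavily with a better one.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Assumed example: two near-duplicate boxes for one car, plus a separate object.
boxes = [(100, 200, 400, 380), (110, 205, 410, 390), (450, 100, 550, 180)]
scores = [0.9, 0.75, 0.8]
print(non_max_suppression(boxes, scores))  # [0, 2]
```

The second box overlaps the first with an IoU of about 0.86, so it is dropped as a duplicate, while the third box does not overlap anything and survives.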

ai_curious · August 28, 2022, 8:27pm

And the fact that YOLO can handle bounding boxes larger than a grid cell is a key differentiator and advantage over sliding windows. The sliding windows approach iteratively chops up the image and can lose parts of objects that are larger than the window or near its edge. YOLO processes the entire input image all at once. Each grid cell predicts whether an object center is within it, along with the center location and bounding box shape. It is perfectly fine for the bounding box dimensions to exceed the grid cell size.
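A small sketch of how one cell's prediction might be turned back into an image-space box shows why nothing restricts the box to the cell. It assumes the encoding used in the lectures, where the center `(bx, by)` is an offset within the cell in the range 0 to 1, and `bw`, `bh` are expressed in units of the cell size and may exceed 1. The function name and the numbers are hypothetical.

```python
def decode_cell_prediction(row, col, bx, by, bw, bh, cell_size):
    """Convert a cell-relative prediction to (x_min, y_min, x_max, y_max) pixels.

    Assumed encoding: bx, by in [0, 1] locate the center inside the cell;
    bw, bh are multiples of the cell size and are NOT limited to 1.
    """
    cx = (col + bx) * cell_size
    cy = (row + by) * cell_size
    w = bw * cell_size
    h = bh * cell_size
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


# Assumed example: cell (row 9, col 7) of a grid with 32 px cells predicts a box
# more than nine cells wide and more than five cells tall.
box = decode_cell_prediction(row=9, col=7, bx=0.8125, by=0.0625,
                             bw=9.375, bh=5.625, cell_size=32)
print(box)  # (100.0, 200.0, 400.0, 380.0)
```

Only the center offsets are tied to the cell. The width and height are just regression outputs, so a 300 px wide car can be described from a 32 px cell.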
