[C4W3] YOLO grid question

Artick · August 26, 2021, 11:50am

In the videos, Andrew uses a 3x3 grid over the car image (the one with the mountain road and snow). With 3x3 the intuition works quite well because the car fits in a cell, but then he says that in practice a finer grid is used. If we overlay a 19x19 grid, though, each grid cell will contain only a tiny portion of the car. How can the network predict an accurate bounding box if it just reports the grid cell in which it predicted the car, when the car actually spans many grid cells and each cell contains only a small part of the object?

It is my understanding that for each grid cell the network will produce an output vector. Each output vector also encodes a bounding box, which in this case will contain only a small portion of the object.

Thanks!

ai_curious · August 26, 2021, 3:56pm

Maybe take a look at this previous post, and see if it addresses your question?

That may be a lot to digest if it’s your first time really digging into this algorithm, but it shows the equations YOLO uses to relate predictions to anchor box and grid cell sizes. There is no other way to really understand it, in my opinion.

The TL;DR is that each network output location, indexed by (m, S, S, B), holds a vector of (1 + 4 + C) predictions (1 object score, 4 box coordinates, C class scores) computed from the entire image, so these predictions are not constrained to the specific grid cell or anchor box shape they represent. This is one of the key aspects that differentiates YOLO from sliding windows and other region-based approaches.
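To make that concrete, here is a minimal sketch of how a predicted box can be much larger than the cell that "owns" it. It assumes the YOLOv2-style parameterization (center offset through a sigmoid, width and height as an exponential scaling of the anchor); the course's code may differ in details. The function names, the object center, and the anchor size are hypothetical values picked for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def cell_for_center(cx, cy, grid_size):
    """Return (row, col) of the grid cell containing a center point.

    cx, cy are normalized to the image (0..1). Only the object's center
    decides which cell is responsible for it.
    """
    col = min(int(cx * grid_size), grid_size - 1)
    row = min(int(cy * grid_size), grid_size - 1)
    return row, col


def decode_box(t_x, t_y, t_w, t_h, row, col, anchor_w, anchor_h, grid_size):
    """Decode raw outputs into (b_x, b_y, b_w, b_h), normalized to the image.

    Assumed YOLOv2-style equations, with anchors given in grid-cell units:
      b_x = (col + sigmoid(t_x)) / S    center stays inside the cell
      b_y = (row + sigmoid(t_y)) / S
      b_w = anchor_w * exp(t_w) / S     size is NOT limited to one cell
      b_h = anchor_h * exp(t_h) / S
    """
    b_x = (col + sigmoid(t_x)) / grid_size
    b_y = (row + sigmoid(t_y)) / grid_size
    b_w = anchor_w * math.exp(t_w) / grid_size
    b_h = anchor_h * math.exp(t_h) / grid_size
    return b_x, b_y, b_w, b_h


S = 19  # 19x19 grid, as in the question

# Hypothetical car whose center sits at (0.62, 0.71) of the image.
row, col = cell_for_center(0.62, 0.71, S)

# Hypothetical raw outputs (all zero) and a hypothetical anchor that is
# 6 cells wide and 3.5 cells tall.
box = decode_box(0.0, 0.0, 0.0, 0.0, row, col,
                 anchor_w=6.0, anchor_h=3.5, grid_size=S)

cell_size = 1.0 / S
print("responsible cell (row, col):", (row, col))
print("box width in cells:", box[2] / cell_size)
print("box height in cells:", box[3] / cell_size)
```

Only the box center is tied to the cell. The width and height scale the anchor and can cover many cells, which is why a 19x19 grid does not force tiny boxes.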
