YOLO - How does the bounding box get identified when an object spans multiple sliding windows (grid cells)?

Prakash_Janjanam · November 25, 2021, 12:08pm

Hello Mentor Team,

Good day! I have been trying hard to visualize the object classification with localization concept used in the YOLO algorithm, and how bounding boxes are identified for objects that cut across grid cells.

My understanding from the lecture (video: Bounding Box Predictions) is that each slice of the image (determined by the 3x3 or 19x19 grid) goes through the convolutional net to figure out whether an object exists, which one it is, and where it is.

In YOLO, while there is the optimization of processing all windows in one go through shared computation, my question is: if the object cuts across 2 grid cells (or 4 grid cells in the worst case), how does the bounding box get identified?

If each slice of the image that is convolved contains only a part of the car, how do the parts of the object across grid cells get combined and a midpoint identified? It would be great if someone could shed some light on this. I hope I have framed my question clearly.

Thanks in Advance,
Prakash Janjanam

ai_curious · November 25, 2021, 1:07pm

Several threads in the forum cover this. Maybe take a look and tell us what you find?

[Week 3 Yolo Doubt About Sliding Window - #3 by ai_curious]

[Quick question regarding YOLO algorithm]

[[C4W3] YOLO grid question]

[Detecting Multiple Objects using YOLO - Grid Cells plus Anchor Boxes]

The tl;dr is that grid cells in YOLO are not sliding windows. Unlike a sliding-window approach, YOLO does not actually divide the input image into subregions. The grid cells represent sets of predictions, each of which is made concurrently and each of which uses the entire input image.
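
To make that concrete, here is a minimal sketch of how a ground-truth box could be mapped to a single grid cell, assuming the label convention from the course lectures: the object belongs to the one cell that contains its center, the center offsets are measured inside that cell, and the width and height are measured in cell units, so they may exceed 1. The function names `encode_box` and `decode_box` are hypothetical and are not taken from any YOLO codebase.

```python
def encode_box(cx, cy, w, h, grid_size):
    """Assign a box to the grid cell that contains its center.

    Inputs are in normalized image coordinates (0..1).
    Returns (row, col, bx, by, bw, bh), where bx, by are the center
    offsets inside the owning cell and bw, bh are in cell units.
    This follows the assumed course-style label convention.
    """
    # Only the cell containing the center "owns" the object.
    col = min(int(cx * grid_size), grid_size - 1)
    row = min(int(cy * grid_size), grid_size - 1)
    # Center position relative to the owning cell's top-left corner.
    bx = cx * grid_size - col
    by = cy * grid_size - row
    # Size in cell units; larger than 1 when the box spans several cells.
    bw = w * grid_size
    bh = h * grid_size
    return row, col, bx, by, bw, bh


def decode_box(row, col, bx, by, bw, bh, grid_size):
    """Invert encode_box: recover the box in normalized image coordinates."""
    cx = (col + bx) / grid_size
    cy = (row + by) / grid_size
    return cx, cy, bw / grid_size, bh / grid_size


# A wide car centered slightly below the middle of the image, 3x3 grid.
row, col, bx, by, bw, bh = encode_box(0.5, 0.6, 0.6, 0.3, grid_size=3)
print(row, col, round(bx, 3), round(by, 3), round(bw, 3), round(bh, 3))
```

In this example the car covers parts of three cells horizontally, yet only the center cell (row 1, column 1) carries its label, and its width comes out larger than one cell. Nothing has to be stitched together across cells, because the one owning cell describes the whole box.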

Prakash_Janjanam · November 25, 2021, 1:56pm

@ai_curious, thank you for pointing me to these resources. I shall go through them to understand this better.
