YOLO concept confusion

My question is about how the predicted bounding boxes can exceed the size of the grid cell when the network activations are based on the individual grid cell. I mean, everything outside of the grid cell should be unknown to the neurons predicting the bounding boxes for an object detected in that cell, right?

More precisely here are my questions:

1. How does the algorithm predict bounding boxes that are larger than the grid cell?

2. How does the algorithm know in which cell the center of the object is located?

This has been asked and answered many times already. Try searching this forum with YOLO as the keyword. You might also find this thread useful: [Applying YOLO anchor boxes]. Several other related threads are linked from it. The shortcut/spoiler ideas are: 1) The network input is the entire image, not the grid cell subregions, so a cell's predictions can draw on pixels outside that cell. 2) The object center location is provided as part of the training input (the ground-truth labels). Subsequently it is part of the network output (predictions).
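To make those two points concrete, here is a rough sketch, not the actual course or Darknet code. It assumes a YOLOv2-style parametrization in which the center is a sigmoid offset inside the cell and the width/height scale an anchor box given in grid-cell units. The function names `assign_cell` and `decode_box`, the 19x19 grid, and the anchor size in the usage note are all made up for illustration.

```python
import math


def assign_cell(cx, cy, grid_size):
    """Training side: map a labeled box center (normalized to 0..1)
    to the grid cell that owns it, plus the offset inside that cell."""
    col = min(int(cx * grid_size), grid_size - 1)
    row = min(int(cy * grid_size), grid_size - 1)
    return row, col, cx * grid_size - col, cy * grid_size - row


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def decode_box(row, col, tx, ty, tw, th, anchor_w, anchor_h, grid_size):
    """Prediction side: turn raw outputs for one cell and one anchor
    into a box normalized to the whole image (assumed YOLOv2-style)."""
    # The sigmoid keeps the predicted center inside the owning cell.
    cx = (col + sigmoid(tx)) / grid_size
    cy = (row + sigmoid(ty)) / grid_size
    # Width and height scale the anchor (in cell units) and are not
    # limited to one cell, so the box can span many cells.
    w = anchor_w * math.exp(tw) / grid_size
    h = anchor_h * math.exp(th) / grid_size
    return cx, cy, w, h
```

For example, `assign_cell(0.5, 0.5, 19)` places a center in the middle of the image into row 9, column 9, and `decode_box(9, 9, 0, 0, 0, 0, 3.0, 5.0, 19)` returns a box about 3 cells wide and 5 cells tall even though it is predicted from that single cell. Only the center is tied to the cell; the size is not.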

This thread might also be helpful: [Quick question regarding YOLO algorithm - #3 by ai_curious]
