This has been asked and answered many times already. Try searching this forum with YOLO as the keyword. You might also find this thread useful: [Applying YOLO anchor boxes]. There are several other related threads linked from it. The short version (spoiler) is: 1) the network input is the entire image, not the grid cell subregions; 2) the object center location is provided as part of the training data (the ground-truth labels), so it is also part of the network output (predictions).
This thread might also be helpful: [Quick question regarding YOLO algorithm - #3 by ai_curious]
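
To make the two spoiler points concrete, here is a minimal sketch of how a ground-truth label for one whole image could be built. It is an illustration under stated assumptions, not the exact layout of any particular YOLO version: the function name `encode_labels`, the 19x19 grid, the per-cell layout `[p_c, x, y, w, h, classes...]`, and the absence of anchor boxes are all simplifications chosen for this example. The point it shows is that the image is never cut into cells; the grid only decides where in the label tensor each object's center is recorded.

```python
def encode_labels(boxes, grid_size=19, num_classes=3):
    """Build a grid_size x grid_size x (5 + num_classes) label for ONE image.

    boxes: iterable of (x_center, y_center, width, height, class_id), with all
    coordinates normalized to [0, 1) relative to the whole image.
    Assumed (hypothetical) cell layout: [p_c, x, y, w, h, one-hot classes].
    """
    depth = 5 + num_classes
    labels = [[[0.0] * depth for _ in range(grid_size)] for _ in range(grid_size)]

    for x_c, y_c, w, h, class_id in boxes:
        # The cell that contains the object's center "owns" the object.
        col = min(int(x_c * grid_size), grid_size - 1)
        row = min(int(y_c * grid_size), grid_size - 1)

        cell = labels[row][col]
        cell[0] = 1.0                      # p_c: an object is centered here
        cell[1] = x_c * grid_size - col    # center x, as an offset inside the cell
        cell[2] = y_c * grid_size - row    # center y, as an offset inside the cell
        cell[3] = w                        # width, relative to the whole image
        cell[4] = h                        # height, relative to the whole image
        cell[5 + int(class_id)] = 1.0      # one-hot class
    return labels


# One object centered at (0.5, 0.25) in the full image, class 1.
labels = encode_labels([(0.5, 0.25, 0.2, 0.4, 1)])
print(labels[4][9])  # → [1.0, 0.5, 0.75, 0.2, 0.4, 0.0, 1.0, 0.0]
```

The network would be trained to map the full image to a tensor of this same shape, which is why the center location shows up in the predictions as well. In practice you would use arrays (NumPy or a framework tensor) and add an anchor-box dimension; plain lists are used here only to keep the sketch dependency-free.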