ai_curious has created quite a few reference threads about YOLO on the forums that are worth reading:
Here’s one about how anchor boxes are derived. That is a separate learning step from the actual training of YOLO.
Here’s one that discusses how anchor boxes are used.
Here’s one about training YOLO.
A little searching will turn up more such threads. The Discourse search engine works pretty well.
On the question of how the output layer works: if the paper talks about the fully connected layer, doesn’t it also say how they get from there to the final output? If not, we have the model imported into the notebook. You can print its “summary()” and see what the last few layers look like.
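
To make the "print the summary" suggestion concrete, here is a minimal sketch. It assumes the notebook's model is a Keras model; since that model isn't available here, `toy_model` is a made-up stand-in, and the helper `describe_last_layers` and the layer names are hypothetical. The 19 x 19 x 425 output shape is only an illustrative choice, so check it against what your own notebook prints.

```python
from tensorflow import keras


def describe_last_layers(model, n=3):
    """Return (name, layer type, output shape) for the last n layers of a Keras model."""
    return [
        (layer.name, type(layer).__name__, tuple(layer.output.shape))
        for layer in model.layers[-n:]
    ]


# Hypothetical stand-in for the model loaded in the notebook (NOT the real YOLO network).
inputs = keras.Input(shape=(19, 19, 64))
x = keras.layers.Conv2D(32, 3, padding="same", activation="relu", name="conv_a")(inputs)
outputs = keras.layers.Conv2D(425, 1, name="conv_out")(x)
toy_model = keras.Model(inputs, outputs, name="toy_head")

# Full layer-by-layer table, as suggested above.
toy_model.summary()

# Just the tail of the network, which is usually the part of interest here.
for name, kind, shape in describe_last_layers(toy_model, n=2):
    print(name, kind, shape)
```

In the notebook you would pass the already-loaded model to `summary()` (and to the helper) instead of `toy_model`. The leading `None` in each shape is the batch dimension.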