I'm not sure what you mean by this, but computing the center of a box is pure geometry, of course. So I assume you mean finding the centroid of each recognized object, which determines the grid cell the object is assigned to. The network simply learns that through training. Training an algorithm like YOLO requires a huge amount of data, and every training sample is labeled with all the relevant information, including object types and bounding boxes. The loss function is a hybrid, since it needs to handle classification outputs as well as regression-style outputs such as the bounding box coordinates. Prof Ng does not really discuss how the training works, but there are a number of very detailed threads here on the forum about YOLO. Here’s one that covers the training.
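To make the "pure geometry" part concrete, here is a minimal sketch of how a labeled box's center maps to a grid cell when the training labels are built. The function names, the corner-style box format, and the 19 x 19 grid on a 608 x 608 image are my assumptions for illustration, not something taken from the course code.

```python
def box_center(x_min, y_min, x_max, y_max):
    """Return the (x, y) center of a box given by its corner coordinates."""
    return (x_min + x_max) / 2.0, (y_min + y_max) / 2.0


def grid_cell(cx, cy, image_w, image_h, grid_size=19):
    """Return the (row, col) of the grid cell that contains the point (cx, cy).

    grid_size=19 is an assumed value; use whatever grid your model defines.
    """
    # Scale the center into grid units, then truncate to get the cell index.
    col = int(cx / image_w * grid_size)
    row = int(cy / image_h * grid_size)
    # A center lying exactly on the right or bottom edge would index one
    # past the last cell, so clamp it back into range.
    col = min(col, grid_size - 1)
    row = min(row, grid_size - 1)
    return row, col


# Hypothetical labeled box in a 608 x 608 image.
cx, cy = box_center(100, 200, 300, 400)
row, col = grid_cell(cx, cy, image_w=608, image_h=608)
print(cx, cy, row, col)
```

Note that this is only how the labels are constructed. At prediction time the network outputs the box position itself, which is the part it has to learn from the labeled data.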
They are learned a priori using a different algorithm. Here’s a thread about that. And here’s a thread about how they are applied.