Bounding boxes for the new images

The training set, and therefore the validation and test sets as well, come with bounding boxes. But when you use the CNN model in the real world, the incoming images do not have bounding boxes. How does that work?

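If we had to supply a bounding box with every real-time image, what would be the purpose of training the model? During training, the model learns from the training dataset the features it needs to locate and recognize objects on its own. At inference time it takes an image that has no bounding box and, in the case of a detection model, predicts the box itself.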
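The split between training and inference can be sketched with a toy example. This is only an illustration under loud assumptions: the "images" are synthetic 16-number feature vectors, the "model" is a plain least-squares fit rather than a CNN, and all names (`train_features`, `train_boxes`, `predicted_box`) are made up for the sketch. The point is just where the boxes appear and where they do not.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for a labeled dataset (an assumption, not real data):
# 200 "images" as 16-dimensional feature vectors, each with an annotated
# box in (x_min, y_min, x_max, y_max) form.
train_features = rng.normal(size=(200, 16))
true_weights = rng.normal(size=(16, 4))
train_boxes = train_features @ true_weights

# Training: the annotated boxes are the targets the model is fitted to.
weights, *_ = np.linalg.lstsq(train_features, train_boxes, rcond=None)

# Inference: a new image arrives with no box at all; the model outputs one.
new_image_features = rng.normal(size=(1, 16))
predicted_box = new_image_features @ weights

print(predicted_box.shape)
```

Note that `train_boxes` is used only in the fitting step. The prediction step receives nothing but the new image's features.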

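The annotated bounding boxes are labels, not inputs. During training they are the targets that tell the model which regions and features of an image are important. On the validation and test sets they are the reference against which the model's predicted boxes are scored.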
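To make "the box is a label, not an input" concrete: the model outputs a predicted box, and the annotated box is only what that prediction is compared against. A common overlap score for this comparison in object detection is intersection over union (IoU). The sketch below assumes boxes in (x_min, y_min, x_max, y_max) form; the function name `iou` and the example coordinates are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlapping region (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection

    return intersection / union if union > 0 else 0.0


annotated_box = (10, 10, 50, 50)   # label that ships with the dataset
predicted_box = (20, 20, 60, 60)   # what a model might output for that image
score = iou(annotated_box, predicted_box)
print(round(score, 3))
```

A score of 1.0 means the predicted box matches the annotation exactly, and 0.0 means the two boxes do not overlap.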