Week 3 Question About Anchor Box Dimensions

raccooncai · September 14, 2022, 10:55pm

In the “YOLO Algorithm” lecture by Andrew (Week 3), he notes that the target label’s 4th dimension is 5 + the # of classes, with the 5 being {p_c, b_x, b_y, b_h, b_w}. However, if the anchor box information and thus b_h and b_w are already encoded into the 3rd dimension, wouldn’t it be more practical to only put p_c, b_x, and b_y in the 4th dimension?
In other words, isn’t it redundant to include the height and width of the bounding box in the 4th dimension, if that information is already encoded in anchor boxes?

ai_curious · September 14, 2022, 11:39pm

Anchor boxes and bounding boxes (whether ground truth or predicted) are not the same thing. Anchor boxes are abstract: they capture shapes that are representative of the object shapes in the training data set, but not the shape of any one particular object in any particular image. They also have only a shape, no location. Ground truth and predicted bounding boxes, on the other hand, have both location and shape because they represent specific objects in specific input images. The b_h and b_w values in the prediction output vector describe the bounding box shape, not the anchor box shape, so they are not redundant with the anchor box dimension.
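
To make the distinction concrete, here is a minimal sketch of how the label for a single grid cell might be built. The anchor shapes, the number of classes, and the helper names (shape_iou, best_anchor, make_cell_label) are all made up for illustration and are not taken from the course assignment. It assumes the object is assigned to whichever anchor has the highest shape-only IoU with its ground truth box.

```python
# Hypothetical anchor shapes as (height, width). Shape only, no location.
ANCHORS = [(1.0, 3.0), (3.0, 1.0)]
NUM_CLASSES = 3  # assumed value, for illustration only


def shape_iou(box_hw, anchor_hw):
    """IoU of two boxes compared by shape only, as if they shared a center."""
    inter = min(box_hw[0], anchor_hw[0]) * min(box_hw[1], anchor_hw[1])
    union = box_hw[0] * box_hw[1] + anchor_hw[0] * anchor_hw[1] - inter
    return inter / union


def best_anchor(box_hw):
    """Index of the anchor whose shape overlaps the box shape the most."""
    return max(range(len(ANCHORS)), key=lambda i: shape_iou(box_hw, ANCHORS[i]))


def make_cell_label(b_x, b_y, b_h, b_w, class_id):
    """Label for one grid cell: one (5 + NUM_CLASSES) vector per anchor."""
    label = [[0.0] * (5 + NUM_CLASSES) for _ in ANCHORS]
    k = best_anchor((b_h, b_w))
    one_hot = [1.0 if c == class_id else 0.0 for c in range(NUM_CLASSES)]
    # The anchor index k only selects the slot. The object's own
    # height and width still have to be stored in the vector itself.
    label[k] = [1.0, b_x, b_y, b_h, b_w] + one_hot
    return k, label


k, label = make_cell_label(b_x=0.4, b_y=0.6, b_h=2.4, b_w=1.2, class_id=1)
print("anchor shape:", ANCHORS[k])
print("stored b_h, b_w:", label[k][3:5])
```

The chosen anchor is the tall one, but the stored b_h and b_w are the object's own dimensions rather than the anchor's. Dropping them from the vector would leave only the anchor's generic shape, which is why the last dimension keeps all five values plus the class scores.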

raccooncai · September 15, 2022, 1:45am

That clears it up – thank you!
