I was watching one of the videos about how b_h and b_w are defined during labeling.
In the image above, as said in the video, b_h is 0.3 and b_w is 0.4. I take this to mean that the height of the object is 30% of the height of the bounding box and its width is 40% of the width of the bounding box. But what about in the image below (cat)?
If I want to label row 3, column 2, what would be the values of b_h and b_w? Are b_h and b_w equal to 1.3? I don't understand how a box is labelled when the target image is bigger than the box itself.
Not quite. First, b_w and b_h are the shape of the bounding box, not the object. In this case it is a ground truth bounding box, I think (not a predicted one). In the first image, the one of the truck, 0.3 and 0.4 are ratios of the ground truth bounding box shape to the entire image shape.
In the second image, which adds the concept of grid cells a la YOLO, the values of b_w and b_h represent ratios of the ground truth bounding box shape to the grid cell shape. b_w = 1.0 is thus a bounding box exactly the same width as one grid cell, and b_w = 1.3 is a box 30% wider than one grid cell. Values above 1.0 are therefore fine whenever the object spans more than one cell.
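To make the two conventions concrete, here is a minimal sketch of the arithmetic, assuming the ratios are defined as described above. The function name `box_shape_ratios` and all pixel sizes are hypothetical, chosen only so the numbers come out to the values discussed in this thread; they are not taken from the course images.

```python
def box_shape_ratios(box_w_px, box_h_px, image_w_px, image_h_px,
                     grid_cols=1, grid_rows=1):
    """Return (b_w, b_h) as ratios of the box size to one grid cell.

    With the default 1x1 grid, the single cell is the whole image,
    so the ratios are relative to the full image size.
    """
    cell_w = image_w_px / grid_cols  # width of one grid cell in pixels
    cell_h = image_h_px / grid_rows  # height of one grid cell in pixels
    return box_w_px / cell_w, box_h_px / cell_h


# No grid (truck-style example): hypothetical 1000x1000 image,
# ground truth box 400 px wide and 300 px tall.
truck_b_w, truck_b_h = box_shape_ratios(400, 300, 1000, 1000)
print(truck_b_w, truck_b_h)  # 0.4 0.3

# 3x3 grid (cat-style example): hypothetical 300x300 image, so each
# cell is 100x100 px; a 130x130 px box is larger than a single cell.
cat_b_w, cat_b_h = box_shape_ratios(130, 130, 300, 300,
                                    grid_cols=3, grid_rows=3)
print(cat_b_w, cat_b_h)  # 1.3 1.3
```

Nothing in this calculation caps the ratios at 1.0, which is why a box bigger than its grid cell can still be labelled.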
There is a much more elaborate examination of YOLO grid cell and bounding box relationships in these previous threads: