A doubt on Week 3 Assignment - Car detection with YOLO

Dear Mentor,

Could you please guide me on this issue?

I am trying to understand this function:
from yad2k.utils.utils import scale_boxes

Function scale_boxes:
Input: boxes, image_shape
Output: boxes
Procedure:
1. Get height from image_shape
2. Get width from image_shape
3. Stack the height and width to be [height,width,height,width]
4. boxes ← boxes * [height,width,height,width]
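The procedure above can be sketched in NumPy. This is only a stand-in for illustration: the course helper works on tensors, and the name scale_boxes_np is hypothetical. It assumes boxes has shape (N, 4) with values given as fractions of the image size.

```python
import numpy as np

def scale_boxes_np(boxes, image_shape):
    """NumPy stand-in for the steps listed above (not the yad2k code itself).

    boxes: array of shape (N, 4) as [y_min, x_min, y_max, x_max], assumed in 0..1.
    image_shape: (height, width) of the image the boxes will be drawn on.
    """
    height = float(image_shape[0])   # step 1
    width = float(image_shape[1])    # step 2
    image_dims = np.array([height, width, height, width])  # step 3, shape (4,)
    # Step 4: broadcasting multiplies every row of boxes by image_dims.
    return np.asarray(boxes, dtype=float) * image_dims

normalized = np.array([[0.25, 0.5, 0.75, 1.0]])
pixels = scale_boxes_np(normalized, (720, 1280))
# pixels holds y_min=180, x_min=640, y_max=540, x_max=1280
```

Because image_dims has shape (4,), the same call works for any number of boxes without a loop.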

Let’s say YOLO’s network was trained to run on 608x608 images.

The car detection dataset has 720x1280 images, so this step rescales the boxes so that they can be plotted on top of the original 720x1280 image.

boxes = scale_boxes(boxes, image_shape = (720,1280))

Let boxes = [y_min, x_min, y_max, x_max].

Is it correct to simply apply element-wise multiplication to these two arrays to do the rescaling?

boxes = [y_min, x_min, y_max, x_max] * [720, 1280, 720, 1280]

Thank you

Hey @JJaassoonn,

Yes, your description of how the scale_boxes function rescales the bounding boxes is correct.

The scale_boxes function rescales the predicted boxes to the original image size (720x1280 in this case), even though the YOLO network itself runs on 608x608 inputs.

Element-wise multiplication as shown (boxes * [720, 1280, 720, 1280]) gives correct pixel coordinates when the boxes are normalized, i.e. each of y_min, x_min, y_max, x_max is a fraction between 0 and 1 of the image height or width. That is the form in which the boxes reach scale_boxes in this assignment, so multiplying by the height and width converts each fraction into pixels of the 720x1280 image.

However, this only holds for normalized boxes. If your boxes were instead expressed in pixels of the 608x608 network input, you would scale by the ratios 720/608 and 1280/608 rather than by 720 and 1280.
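To make the two cases concrete, here is a minimal NumPy sketch comparing boxes stored as fractions of the image (0 to 1) with boxes stored in pixels of the 608x608 network input. The helper rescale_pixel_boxes and the sample box are hypothetical and only serve as an illustration.

```python
import numpy as np

def rescale_pixel_boxes(boxes, src_shape, dst_shape):
    """Hypothetical helper: map pixel boxes from a (h, w) source to a (h, w) target."""
    src_h, src_w = src_shape
    dst_h, dst_w = dst_shape
    ratio_h = dst_h / src_h
    ratio_w = dst_w / src_w
    factors = np.array([ratio_h, ratio_w, ratio_h, ratio_w])
    return np.asarray(boxes, dtype=float) * factors

target_dims = np.array([720, 1280, 720, 1280])

# The same box written two ways: as fractions, and as 608x608 pixels.
normalized_box = np.array([[0.25, 0.5, 0.75, 1.0]])
pixel_box_608 = normalized_box * 608          # [152, 304, 456, 608]

# Case 1: normalized boxes, multiply by [height, width, height, width].
from_normalized = normalized_box * target_dims

# Case 2: 608x608 pixel boxes, multiply by the ratios 720/608 and 1280/608.
from_pixels = rescale_pixel_boxes(pixel_box_608, (608, 608), (720, 1280))

# Mixing the two up: pixel boxes multiplied by the raw image size
# land far outside the 720x1280 image.
wrong = pixel_box_608 * target_dims
```

Both correct routes give approximately the same box, about [180, 640, 540, 1280] up to floating-point rounding, while the mixed-up version produces coordinates in the hundreds of thousands.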

Hope it makes sense to you now.
Cheers!
Jamal