Programming Exercise - Anchor Boxes


How do these numbers saved in yolo_anchors.txt map to 5 anchor boxes?

0.57273, 0.677385, 1.87446, 2.06253, 3.33843, 5.47434, 7.88282, 3.52778, 9.77052, 9.16828

I am assuming each pair of consecutive numbers represents the height and width of a box?

Also, one more question. In the programming exercise, they are using the word Tensor. Is that the same as Vector?

Please try help(yolo_head) to view documentation of anchors.
See this for tensor documentation.

To your first question: conceptually, the answer is ‘yes’, each consecutive pair describes one box, though I think this implementation of YOLO typically puts width first, then height, everywhere the two are used.
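To make the pairing concrete, here is a minimal sketch that splits the ten numbers from the question into five boxes. It assumes the (width, height) ordering suggested above and uses plain NumPy rather than the exercise's own file-reading helper, so the variable names are only illustrative.

```python
import numpy as np

# The ten values quoted in the question, as they appear in yolo_anchors.txt.
raw = ("0.57273, 0.677385, 1.87446, 2.06253, 3.33843, "
       "5.47434, 7.88282, 3.52778, 9.77052, 9.16828")

# Parse the comma-separated text into floats.
values = [float(x) for x in raw.split(",")]

# Group consecutive numbers into pairs: one row per anchor box.
# Assumption: each pair is ordered (width, height).
anchors = np.array(values).reshape(-1, 2)

for i, (w, h) in enumerate(anchors):
    print(f"anchor {i}: width={w}, height={h}")
```

Ten numbers reshaped into rows of two give an array of shape (5, 2), which is where the 5 anchor boxes come from.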

If your second question is referring to places in the code like this:

box_wh = box_wh * anchors_tensor / conv_dims

It is named that way to distinguish between two forms of the same anchors. Where they are read in from the file, such as in preprocess_true_boxes(), they are used as a Python variable of type List. Where they have been converted to TensorFlow objects, such as in yolo_head(), they get the _tensor suffix. The conversion happens here:

    # Reshape to batch, height, width, num_anchors, box_params.
    anchors_tensor = K.reshape(K.variable(anchors), [1, 1, 1, num_anchors, 2])

NOTE: the anchors are converted from a simple List of tuples to anchors_tensor so that they have the correct shape to be multiplied by the network’s predicted output for width and height, which has shape (m,S,S,B,2).
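The shape argument can be checked with a small sketch. This one uses NumPy broadcasting as a stand-in for the Keras backend calls, and the values of m, S and the all-ones prediction array are hypothetical placeholders, not taken from the exercise.

```python
import numpy as np

# Hypothetical sizes for illustration: batch of 2, 19x19 grid, 5 anchors.
m, S, B = 2, 19, 5

# The five anchors from yolo_anchors.txt, assumed ordered (width, height).
anchors = np.array([
    [0.57273, 0.677385],
    [1.87446, 2.06253],
    [3.33843, 5.47434],
    [7.88282, 3.52778],
    [9.77052, 9.16828],
])

# Same reshape as in yolo_head(): leading axes of size 1 let the anchors
# broadcast across every image and every grid cell.
anchors_tensor = anchors.reshape(1, 1, 1, B, 2)

# Stand-in for the network's width/height output, shape (m, S, S, B, 2).
pred_wh = np.ones((m, S, S, B, 2))

# Elementwise multiply; the result keeps the prediction's shape.
scaled_wh = pred_wh * anchors_tensor
print(scaled_wh.shape)
```

Because the stand-in prediction is all ones, every grid cell in every image ends up holding the five anchor shapes unchanged, which shows that each anchor scales only its own slot along the B axis.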

For more about where those anchor box dimensions came from, you might find this helpful

This thread summarizes how the anchor box shape influences the predicted bounding box width and height at runtime.

Hope this helps
