Dear:

I have a question about the output for boxes. I know the shape of boxes is (10,) because we have 10 anchor boxes, and boxes[2] has four values which correspond to coordinates. But I want to ask why the values here are so big, and even negative. What exactly do these values say about the coordinates of the box? Could you please show a picture of the meaning of these values? Thanks a lot.

boxes[2] = [-1240.3483 -3212.5881 -645.78 2024.3052]

Also, when we run the code below:

image, image_data = preprocess_image("images/" + "test.jpg", model_image_size=(608, 608))

yolo_model_outputs = yolo_model(image_data)

print(yolo_model_outputs[0,0,2,0:5])

I got the following result:

<tf.Tensor: shape=(5,), dtype=float32, numpy=array([ 0.02702259, -1.5739233, 0.6303474, -1.9991553, -11.171314], dtype=float32)>

which represents the values of x, y, h and w.

So my question is: why do we have a negative y here, when Professor Andrew said the value should be between 0 and 1?

Hey @jiacheng_Cui,

These values indeed represent the coordinates of the corners of the boxes selected by the `yolo_eval` function. However, please don’t try to relate them to any sort of physical significance, such as why these are large or negative.

This is because these values come only from the set of test inputs, which are themselves **flawed**. If you take a look at the test cell after the `yolo_eval` function, you will see something like:

```
yolo_outputs = (tf.random.normal([19, 19, 5, 2], mean=1, stddev=4, seed=1),
                tf.random.normal([19, 19, 5, 2], mean=1, stddev=4, seed=1),
                tf.random.normal([19, 19, 5, 1], mean=1, stddev=4, seed=1),
                tf.random.normal([19, 19, 5, 80], mean=1, stddev=4, seed=1))
```

Here, the first two tensors represent the X-Y coordinates and the width/height of the output boxes from the encoding model. These are supposed to be positive, but they have been sampled from a normal distribution with mean 1 and standard deviation 4, so these tensors contain plenty of negative values as well.
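To make this concrete, here is a minimal sketch of how such random inputs can end up as large and negative corner values. It uses NumPy instead of TensorFlow so it runs on its own, and it rests on two assumptions that are not spelled out above: that corners are computed as centre minus/plus half the size, and that they are then rescaled to pixel units for a hypothetical 720 x 1280 image. The exact numbers will not match the ones in the assignment.

```python
import numpy as np

rng = np.random.default_rng(1)

# NumPy stand-ins for the first two test tensors (mean=1, stddev=4).
# These are not the same samples that tf.random.normal would produce.
box_xy = rng.normal(loc=1.0, scale=4.0, size=(19, 19, 5, 2))
box_wh = rng.normal(loc=1.0, scale=4.0, size=(19, 19, 5, 2))

# A sizeable share of the sampled "coordinates" is negative.
frac_negative = float((box_xy < 0).mean())

# Assumed corner conversion: centre minus/plus half the width/height.
box_mins = box_xy - box_wh / 2.0
box_maxes = box_xy + box_wh / 2.0
corners = np.concatenate([box_mins, box_maxes], axis=-1)  # shape (19, 19, 5, 4)

# Assumed rescaling to pixels for a hypothetical 720 x 1280 image.
image_dims = np.array([720.0, 1280.0, 720.0, 1280.0])
scaled = corners * image_dims

print(f"fraction of negative x/y inputs: {frac_negative:.2f}")
print(f"scaled corner range: {scaled.min():.1f} to {scaled.max():.1f}")
```

With inputs like these, corner values in the thousands (positive or negative) are expected and say nothing about a real image.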

In simple words, these values are only there to test your code; since the inputs are flawed, the output values are flawed as well.

As for your second question, it is mentioned clearly in the assignment that:

The output of `yolo_model` is a (m, 19, 19, 5, 85) tensor that needs to pass through non-trivial processing and conversion. You will need to call `yolo_head` to format the encoding of the model you got from `yolo_model` into something decipherable.
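As a rough illustration of the kind of conversion involved, the sketch below decodes the five raw values from the question. It assumes the usual YOLOv2-style decoding (a sigmoid on x, y and the confidence, an exponential on the width and height scaled by an anchor size) and assumes the slice is laid out as x, y, w, h, confidence. The anchor size is made up, and this is not the assignment's actual `yolo_head` code.

```python
import numpy as np

def sigmoid(z):
    """Logistic function, maps any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# The five raw values printed in the question for one cell and one anchor.
raw = np.array([0.02702259, -1.5739233, 0.6303474, -1.9991553, -11.171314])

# Assumed decoding: x, y and the confidence go through a sigmoid,
# so each of them ends up strictly between 0 and 1.
xy_in_cell = sigmoid(raw[0:2])
confidence = sigmoid(raw[4])

# Assumed decoding: width and height are exponentiated (always positive)
# and multiplied by an anchor size; this anchor is hypothetical.
hypothetical_anchor_wh = np.array([1.0, 1.0])
wh = np.exp(raw[2:4]) * hypothetical_anchor_wh

print("x, y offsets within the cell:", xy_in_cell)
print("w, h relative to the anchor:", wh)
print("confidence:", confidence)
```

Under these assumptions, a negative raw y is nothing unusual: only the decoded value is expected to lie between 0 and 1.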

So, trying to understand the `yolo_model_outputs` directly is not a good use of your time. You need to first pass it through the `yolo_head` function, and only then try to understand it. If you still want to understand the raw `yolo_model_outputs`, you can look at the source code of the YOLO implementation that has been used in the assignment. I hope this helps.

Regards,

Elemento