Hello,
I am working through the assignment for the CNN lecture and wanted some clarification.
For both box_confidence and box_class_probs, conceptually these should be the probability of having an object and then the probability of each of the classes if an object is present.
This nicely allows for multiplication to get the probability of each class.
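As an illustration of that multiplication (a minimal sketch: the 19x19 grid, 5 anchors, and 80 classes match the assignment's usual dimensions, and the uniform inputs are placeholders standing in for real network output):

```python
import tensorflow as tf

# Placeholder probabilities, assuming a 19x19 grid, 5 anchors, 80 classes.
box_confidence = tf.random.uniform((19, 19, 5, 1))    # P(object) per box
box_class_probs = tf.random.uniform((19, 19, 5, 80))  # P(class | object)

# Elementwise product broadcasts over the class axis, giving
# P(class AND object) for every box.
box_scores = box_confidence * box_class_probs
print(box_scores.shape)  # (19, 19, 5, 80)
```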
When I look at the data, though, I find it surprising.
I was expecting values from 0 to 1, but am finding large numbers, many of them negative. Can I get some clarification on what these numbers are, if they are not the direct probabilities?
Note that box_class_probs looks more like probabilities; it is the box_confidence numbers that surprise me.
print(box_confidence[10,10,:,:])
tf.Tensor(
[[ 0.07983017]
[ 1.3960358 ]
[-2.839337 ]
[ 4.725943 ]
[-5.2636952 ]], shape=(5, 1), dtype=float32)
Thanks so much
Wil
Those are test values.
Both box_confidence and box_class_probs are probabilities.
See for yourself using help(yolo_head):
Help on function yolo_head in module yad2k.models.keras_yolo:
yolo_head(feats, anchors, num_classes)
Convert final layer features to bounding box parameters.
Parameters
----------
feats : tensor
Final convolutional layer features.
anchors : array-like
Anchor box widths and heights.
num_classes : int
Number of target classes.
Returns
-------
box_xy : tensor
x, y box predictions adjusted by spatial location in conv layer.
box_wh : tensor
w, h box predictions adjusted by anchors and conv spatial resolution.
box_conf : tensor
Probability estimate for whether each box contains any object.
box_class_pred : tensor
Probability distribution estimate for each box over class labels.
Right! Notice that in the test cell they used a normal distribution to generate those test values, so the values do not look like what you would get in a real application of this function. But nothing in the computations you are doing fundamentally depends on those values being between 0 and 1, so it is still a legitimate test of your logic. At a higher level, though, I would say this is just sort of “bad form”: it would have been just as easy for them to use a uniform distribution as the PRNG function for the input data there. I’ll file a bug about that, but it’s not a high-priority item. Thanks for pointing this out!
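A quick sketch of the distinction (the normal draw mimics the test cell’s unbounded inputs; tf.sigmoid stands in for the squashing yolo_head applies to raw confidence values in a real forward pass):

```python
import tensorflow as tf

tf.random.set_seed(1)

# Test-cell style: raw values drawn from a normal distribution are
# unbounded, which is why large and negative numbers show up.
raw = tf.random.normal((5, 1))

# In a real forward pass, the confidence values are squashed with a
# sigmoid, so genuine confidences land strictly inside (0, 1).
conf = tf.sigmoid(raw)
print(float(tf.reduce_min(conf)), float(tf.reduce_max(conf)))
```

So the unbounded numbers you printed are perfectly consistent with normally-distributed test inputs, not with real post-sigmoid confidences.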
Thanks for all your help again, I’ll take a look at this again tonight.