Is P_c binary or continuous? Difference between P_c of NMS and single object localization?

In the videos on Object Localization and Anchor Boxes, the output of the algorithm has P_c as a binary value (0 or 1) to indicate whether there is an object, but in Non-max Suppression P_c is a continuous value that represents confidence. I’m confused about the different usage here.

To start, I want to say it is because with anchor boxes the size and shape are predefined, so you either have an object there or not. In straight object localization, the size/shape of the boxes is more flexible, so you can end up with numerous boxes of different sizes, each with an associated probability for how well the detected box covers a certain object.

The linked video discusses the values of Y for the training data, that is, the ground truth. In that case, you know precisely which grid cells have objects and which are empty. (You also know exactly what the bounding box coordinates are, and the correct location in the one-hot vector for the classes.) For grid cells (and anchor boxes) that do have object centers assigned, p_c = 1.0. NOTE: it is a floating point value, not an integer. Empty grid cell / anchor box locations are assigned p_c = 0.0; again, floating point, not integer.

At runtime, the prediction vector \hat{Y} also has floating point values for p_c. During training, the ground truth vector Y and the prediction vector \hat{Y} are compared in the loss function. The better the training, the lower the error, and the closer the predicted p_c gets to 1.0 where there is an object and to 0.0 where there is not, but the predicted value will likely always satisfy 0.0 < p_c < 1.0. That continuous predicted value is the confidence that non-max suppression works with.

Net-net, even if p_c is assigned a value of zero or one in the training labels, it is not actually a binary/integer variable. In both Y and \hat{Y}, the variables are floating point.
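Here is a minimal Python sketch of that idea, in case it helps. Everything in it is made up for illustration: the three grid cells, the logit values, the 0.6 threshold, and the helper names `sigmoid` and `binary_cross_entropy`. I am also assuming a sigmoid output and a cross-entropy loss for p_c, which may not match the exact loss used in the course. The point is only that the labels are the floats 0.0 and 1.0, while the predictions land strictly between them.

```python
import math

def sigmoid(z):
    """Squash a raw network output (logit) into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Loss between a float label (0.0 or 1.0) and a float prediction."""
    y_pred = min(max(y_pred, eps), 1.0 - eps)  # avoid log(0)
    return -(y_true * math.log(y_pred) + (1.0 - y_true) * math.log(1.0 - y_pred))

# Ground-truth p_c for three hypothetical grid cells: object, empty, object.
# These are floats that happen to be exactly 0.0 or 1.0.
y_true = [1.0, 0.0, 1.0]

# Hypothetical raw outputs of the network for the same three cells.
logits = [2.5, -3.0, 0.2]

# Predicted p_c: continuous values strictly between 0 and 1.
y_pred = [sigmoid(z) for z in logits]

# Training compares label and prediction, float against float.
losses = [binary_cross_entropy(t, p) for t, p in zip(y_true, y_pred)]

# At prediction time, low-confidence boxes are dropped before
# non-max suppression (0.6 is an assumed threshold).
threshold = 0.6
kept = [i for i, p in enumerate(y_pred) if p >= threshold]

for i, (t, p, l) in enumerate(zip(y_true, y_pred, losses)):
    print(f"cell {i}: label={t:.1f} predicted={p:.3f} loss={l:.3f}")
print("cells kept after thresholding:", kept)
```

The third cell shows why the continuous value matters: its label is 1.0, but the network is only weakly confident, so it falls below the threshold and would not reach non-max suppression.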

Does this make sense?
