I have a more theoretical question about how real-world data is gathered when constructing class scores. The formula for class scores according to the assignment is:
$score_{c,i} = p_c \times c_i$: the probability $p_c$ that there is an object, times the probability $c_i$ that the object is a certain class.
In the exercise, both the pc and ci values are assigned randomly. In the real world, how would one collect this data for each box of an image? Is there a real algorithm that assigns probabilities and creates those values? Or is this data collected manually by a human who inspects each box and guesses a probability value for pc and ci?
Training data for object detection has zero uncertainty about where a bounding box is and what is inside it. Whether it is human or machine generated, you start off with an object of known type and location, so those probabilities are 1.0.
It’s only during prediction, either during training or at runtime, that there is uncertainty. During training, the computed difference between the prediction and the ground-truth value of 1.0 is what drives learning.
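To make that concrete, here is a minimal sketch of what a ground-truth label for one box might look like next to a prediction. The label layout `[p_c, b_x, b_y, b_h, b_w, c_1, c_2, c_3]`, the three-class list, the `make_label` helper, and the prediction values are all assumptions for illustration, not taken from the assignment.

```python
# Hypothetical label layout for one box: [p_c, b_x, b_y, b_h, b_w, c_1, c_2, c_3]
CLASSES = ["car", "person", "dog"]  # assumed class list, for illustration only


def make_label(class_name, bx, by, bh, bw):
    """Build a ground-truth vector: p_c is 1.0 and the class part is one-hot."""
    one_hot = [1.0 if name == class_name else 0.0 for name in CLASSES]
    return [1.0, bx, by, bh, bw] + one_hot


# The annotator (human or tool) knows the object and its box, so no guessing.
truth = make_label("person", 0.5, 0.4, 0.3, 0.2)

# A made-up network output for the same box (values are illustrative only).
prediction = [0.8, 0.52, 0.41, 0.28, 0.22, 0.1, 0.7, 0.2]

# Per-element difference between the label and the prediction.
errors = [t - p for t, p in zip(truth, prediction)]
objectness_error = errors[0]  # difference between 1.0 and the predicted p_c
```

The only probabilities below 1.0 here come from the network's output. An actual loss function would combine these differences in its own way; the subtraction above just shows where the training signal originates.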
This thread actually has a snippet of a ground-truth file and some Python code for converting it for use in YOLO: How to prepare bounding box labels - #6 by ai_curious
The values are random only so that the test case cell has data to work with. They would not be random in a real-world application.
Concur. The useful takeaway from that section is the behavior related to the shapes. The values are meaningless.
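As a rough sketch of that shape behavior, the snippet below multiplies random stand-ins for pc and ci with NumPy broadcasting. The dimensions (a 19x19 grid, 5 boxes per cell, 80 classes) and the variable names are assumptions chosen for illustration; the random values carry no meaning, just as in the test cell.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed shapes: 19x19 grid cells, 5 boxes per cell, 80 classes.
box_confidence = rng.random((19, 19, 5, 1))    # p_c for each box
box_class_probs = rng.random((19, 19, 5, 80))  # c_i for each box and class

# Broadcasting stretches the trailing 1 to 80, giving one score per class.
box_scores = box_confidence * box_class_probs  # shape (19, 19, 5, 80)

# Best class index and its score for each box.
box_classes = np.argmax(box_scores, axis=-1)   # shape (19, 19, 5)
box_class_scores = np.max(box_scores, axis=-1) # shape (19, 19, 5)

print(box_scores.shape, box_classes.shape, box_class_scores.shape)
```

The point is that the elementwise product keeps one score per class per box, and reducing over the last axis collapses that to a single best class and score for each box.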