I wanted to share with you some thoughts on a real practical application of object detection.
I have pictures, taken from above, of piles of stones in fields. I need to estimate the approximate weight of each pile from its picture.
I divided this into 2 subproblems:
a) Image detection, to quickly identify the space occupied by stones in a picture → this will provide the surface.
b) Statistical processing, to get the approximate weight. Here the user will provide, with each photo, the depth of the pile of stones and the name of the stone, from which its density can be estimated.
The approximate weight will be equal to the surface * the depth * the density.
1. Image detection:
Because stones may overlap in the pictures, sliding-window detection may not work well. I would therefore favor image segmentation.
1.1 Preprocessing: The images are orthophotos, so they can all be rescaled to the same scale, so that, for example, 1 cm on one photo equals 1 cm on another. In other words, a pixel will cover the same ground distance in every picture.
1.2 Image segmentation: Once the pictures have been preprocessed to share an identical measurement scale, image segmentation can be applied.
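As a rough sketch of the rescaling in 1.1, assuming each orthophoto comes with a known ground resolution in cm per pixel: the function names and the 1 cm/pixel target below are made up for illustration, and the actual resampling would be done with whatever image library is used.

```python
def rescale_factor(source_cm_per_px: float, target_cm_per_px: float) -> float:
    """Factor by which the image width and height must be multiplied
    so that one pixel covers target_cm_per_px on the ground."""
    if source_cm_per_px <= 0 or target_cm_per_px <= 0:
        raise ValueError("ground resolutions must be positive")
    return source_cm_per_px / target_cm_per_px


def rescaled_size(width_px: int, height_px: int,
                  source_cm_per_px: float,
                  target_cm_per_px: float = 1.0) -> tuple[int, int]:
    """New (width, height) in pixels after bringing the photo to the
    common scale. The 1.0 cm/pixel default is only an example value."""
    factor = rescale_factor(source_cm_per_px, target_cm_per_px)
    return round(width_px * factor), round(height_px * factor)


# A 4000 x 3000 photo at 0.5 cm/pixel shrinks by half at 1 cm/pixel.
new_width, new_height = rescaled_size(4000, 3000, source_cm_per_px=0.5)
```

The point is only that every photo ends up with the same cm-per-pixel value, so pixel counts from different photos are comparable.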
Question:
Using a pre-trained network might be a good way to go, but I am not sure which pre-trained network should be used.
Second, this pre-trained network will still have to be fine-tuned. Will it be enough if I just input a lot of pictures of piles of stones, or do I also need to input pictures of single stones? The latter might be more time-consuming to collect, but I guess the image segmentation network might do a better job if it is able to recognize each single stone.
1.3 Calculate the surface: A hot spot area in the picture will be identified by image segmentation as being a pile of stones. Then I need to get the surface covered by this hot spot.
I guess that counting the pixels whose segmentation score is higher than a certain threshold (i.e. the pixels inside the hot spot) will give me a number of pixels. Knowing the ground area covered by 1 pixel on the previously scaled orthophoto, I will multiply the two to get the surface.
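To make the pixel counting above concrete, here is a minimal sketch. It assumes the segmentation network outputs a 2D map of per-pixel scores between 0 and 1; the function name, the 0.5 threshold and the toy values are placeholders, not something taken from a real model.

```python
import numpy as np


def pile_area_m2(score_map, threshold: float = 0.5,
                 cm_per_px: float = 1.0) -> float:
    """Surface of the hot spot in square metres.

    score_map : 2D array of per-pixel segmentation scores (assumed 0..1)
    threshold : score above which a pixel counts as stone (placeholder)
    cm_per_px : ground distance covered by one pixel side, in cm
    """
    mask = np.asarray(score_map) >= threshold    # True where stone
    n_pixels = int(mask.sum())                   # pixels in the hot spot
    pixel_area_m2 = (cm_per_px / 100.0) ** 2     # cm -> m, then squared
    return n_pixels * pixel_area_m2


# Toy 2 x 3 score map: three pixels pass the threshold.
scores = np.array([[0.9, 0.8, 0.1],
                   [0.7, 0.2, 0.0]])
area = pile_area_m2(scores, threshold=0.5, cm_per_px=10.0)
```

Note that the area of one pixel is the square of its side length, so the scale has to be squared before multiplying by the pixel count.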
Question:
Am I missing something? Does anyone have a better idea?
Finally, once I have the surface, it can be multiplied by the parameters given by the user, the depth and the stone density, to get the approximate weight of the pile of stones in the picture.
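The final multiplication would then look something like this. It is only a sketch: the function name is hypothetical and the numbers are placeholders, not real measurements or real stone densities.

```python
def pile_weight_kg(surface_m2: float, depth_m: float,
                   density_kg_per_m3: float) -> float:
    """Approximate weight = surface * depth * density.

    Treats the pile as a slab of uniform depth, as described above.
    """
    if min(surface_m2, depth_m, density_kg_per_m3) < 0:
        raise ValueError("surface, depth and density must be non-negative")
    return surface_m2 * depth_m * density_kg_per_m3


# Placeholder inputs: 2 m2 of surface, 0.5 m deep, 1600 kg/m3.
weight = pile_weight_kg(surface_m2=2.0, depth_m=0.5,
                        density_kg_per_m3=1600.0)
```

The units have to match: with the surface in m2 and the depth in m, the density must be in kg/m3 for the result to come out in kg.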
Thanks guys,
Best,
Manu