Measuring distance between objects against a certain threshold

Hello all, I'm working on a project where I detect two objects in an image, measure the distance between them, and then check whether that distance meets a certain threshold.
I've already experimented with object detection models: I take the bounding boxes and try to measure the distance using image processing.
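
For context, here is a minimal sketch of the kind of measurement I mean. It assumes each box is given as (x1, y1, x2, y2) in pixels, that distance is measured between box centers, and that "meets the threshold" means the distance is at or below it. The function names are just illustrative, not from any library, and the result is in pixels, not real-world units.

```python
import math

def box_center(box):
    """Center (cx, cy) of a box given as (x1, y1, x2, y2) in pixels (assumed format)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)

def pixel_distance(box_a, box_b):
    """Euclidean distance in pixels between the centers of two boxes."""
    ax, ay = box_center(box_a)
    bx, by = box_center(box_b)
    return math.hypot(bx - ax, by - ay)

def within_threshold(box_a, box_b, threshold_px):
    """True if the center-to-center distance is at or below the threshold (assumed meaning)."""
    return pixel_distance(box_a, box_b) <= threshold_px

# Example with two hypothetical detections
print(pixel_distance((0, 0, 2, 2), (3, 4, 5, 6)))
print(within_threshold((0, 0, 2, 2), (3, 4, 5, 6), threshold_px=10))
```

Center-to-center distance is only one choice; edge-to-edge distance between the boxes would give different numbers for large objects.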

My questions are:

  1. What metrics can I use to evaluate this problem? Just precision and recall, where a TP is a correct prediction, or is there a better metric I can use?
    a. What metrics can I use to compare my calculated distance with the actual distance between the objects?

  2. How can I get the best accuracy, given that my inputs are just 2D images, so issues like scale and depth are present? I was thinking of combining a depth model's results with the detection model's results. Would that help?
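
For the metrics questions, this is roughly what I have in mind, as a sketch only. It assumes the threshold check is treated as a binary label per image (True = threshold met), and that I have ground-truth distances in the same units as my predictions. The error measures shown (MAE and RMSE) are my own guess at suitable candidates, and the function names are illustrative.

```python
import math

def precision_recall(y_true, y_pred):
    """Precision and recall for the binary 'threshold met' decision."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

def mae(actual, predicted):
    """Mean absolute error between actual and predicted distances."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error; penalizes large misses more than MAE."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

# Example with made-up labels and distances
print(precision_recall([True, True, False, False], [True, False, True, False]))
print(mae([10.0, 20.0], [12.0, 17.0]))
print(rmse([10.0, 20.0], [12.0, 17.0]))
```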

Your opinions are appreciated.