The lecture touches on how to define the cost/loss function for object detection, but the proposed method doesn't seem to account for the constraints on the optimization, i.e. the fact that x, y are bounded and bw, bh are non-negative. I believe the proposal is to use an OLS (squared-error) kind of cost for x, y, bw, bh. There also seems to be an issue with using a y ln p (cross-entropy) kind of cost for the class probabilities if it doesn't account for the constraints that the class probabilities must always sum to 1 and that each probability must lie between 0 and 1. Can someone shed more light on the kind of cost function used so that these constraints are never violated?
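To make the question concrete, here is a minimal sketch of one common way such constraints can be satisfied by construction: the cost itself stays unconstrained, and the raw network outputs are passed through a sigmoid (for x, y), an exponential (for bw, bh) and a softmax (for class probabilities) before the cost is computed. This is an assumption for illustration, not necessarily what the lecture does; the function names `constrain` and `detection_loss` are hypothetical, and the squared-error plus cross-entropy cost just mirrors the OLS and y ln p terms described above.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function; output is in (0, 1)
    # for moderate inputs.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def softmax(logits):
    # Subtract the max for numerical stability; outputs are
    # non-negative and sum to 1.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def constrain(raw_box, raw_classes):
    # Hypothetical helper: map unconstrained outputs to valid values.
    # Assumption: x, y are normalised to the unit interval.
    tx, ty, tw, th = raw_box
    box = (sigmoid(tx), sigmoid(ty), math.exp(tw), math.exp(th))
    probs = softmax(raw_classes)
    return box, probs


def detection_loss(raw_box, raw_classes, true_box, true_class):
    # Squared error on the box terms plus cross-entropy (-ln p of the
    # true class) on the class probabilities.
    box, probs = constrain(raw_box, raw_classes)
    box_cost = sum((p - t) ** 2 for p, t in zip(box, true_box))
    class_cost = -math.log(probs[true_class])
    return box_cost + class_cost


box, probs = constrain((-3.0, 2.5, -1.0, 0.7), [5.0, -2.0, 0.1])
print(box, probs, sum(probs))
```

With this kind of reparameterization the optimizer works on unconstrained values, so no explicit constraint handling is needed in the cost; whether the lecture intends this or something else is exactly what I would like clarified.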

If you still have this doubt, please point to the lecture and the timestamp in question. Thanks.