Suppose that, for a particular problem, I studied carefully and arrived at the best optimizing metric (a real number). Suppose also that this optimizing metric is very specific to my problem and is not one of the conventional metrics, like accuracy, precision, recall, etc. It is a tailored metric that is best suited to the problem at hand.

Given all the premises above, my question is:

1. Is the best practice in the deep learning industry to keep the cost function for classifier training as a standard function, like cross-entropy, and use the tailored metric only as a means of ranking different classifiers, by computing this metric on the development set?

2. Or is it worth implementing the optimizing metric directly as the cost function? (In this second case, after training, the task of checking the classifiers against the dev set remains the same as in case 1 above.)
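To make option 1 concrete, here is a minimal sketch of the workflow I have in mind. Everything here is illustrative: `tailored_metric` is a hypothetical placeholder standing in for my problem-specific metric, and the two models are represented only by their dev-set predictions.

```python
import numpy as np

def cross_entropy(probs, labels):
    # Standard surrogate loss that would be minimized during training.
    return -np.mean(np.log(probs[np.arange(len(labels)), labels] + 1e-12))

def tailored_metric(probs, labels):
    # Hypothetical placeholder for the problem-specific metric;
    # here: mean probability assigned to the correct class.
    return np.mean(probs[np.arange(len(labels)), labels])

# Two "trained" classifiers, represented by their dev-set predictions.
dev_labels = np.array([0, 1, 1, 0])
model_a = np.array([[0.9, 0.1], [0.2, 0.8], [0.6, 0.4], [0.7, 0.3]])
model_b = np.array([[0.6, 0.4], [0.4, 0.6], [0.3, 0.7], [0.8, 0.2]])

# Option 1: each model was trained with cross-entropy; the tailored
# metric is used only to rank the candidates on the dev set.
scores = {name: tailored_metric(p, dev_labels)
          for name, p in [("a", model_a), ("b", model_b)]}
best = max(scores, key=scores.get)
```

So the metric never touches the training loop; it only decides which trained classifier to keep.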

In other words, my question is basically about the definition of the cost function. Should I bother implementing a tailored cost function that stays close to the optimizing metric, or should I use the optimizing metric solely to check and rank the different classifiers on the dev set?
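One concern I already have with option 2: gradient-based training needs a differentiable cost, and a step-like metric (for example, anything built on 0/1 correctness) is piecewise constant, so it provides no gradient signal. A toy finite-difference check of this, with purely illustrative names:

```python
import numpy as np

def hard_metric(logit, label):
    # Step-like metric (0/1 correctness): piecewise constant, so its
    # gradient w.r.t. the model output is zero almost everywhere.
    return float((logit > 0) == bool(label))

def smooth_surrogate(logit, label):
    # Differentiable stand-in (logistic loss), usable as a cost function.
    z = logit if label else -logit
    return np.log1p(np.exp(-z))

# Finite-difference "gradient" at logit = 0.5 for a positive example.
eps = 1e-4
g_hard = (hard_metric(0.5 + eps, 1) - hard_metric(0.5 - eps, 1)) / (2 * eps)
g_soft = (smooth_surrogate(0.5 + eps, 1)
          - smooth_surrogate(0.5 - eps, 1)) / (2 * eps)
# g_hard is 0.0 (no learning signal); g_soft is nonzero.
```

So if I go with option 2, I assume I would need the tailored metric itself, or a smooth surrogate of it, to be differentiable.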