In the Skewed data sets section of Class 2, week 3, where precision/recall is discussed, I might have missed it, but I am curious which data is usually used to present this information: the training, cross-validation, or test data? It might be useful to look at this curve for the training and CV data while you are tuning the model's hyperparameters (alpha, lambda, et cetera), but it seems more appropriate to look at and show this plot using the predictions on the test data, after you have already tuned your model.
Question:
Is there a usual way this is done in the industry? Any thoughts are appreciated.
Thanks.
In general, the process is that you train on the training set, using the validation set results to adjust the training parameters, and then use the test set only for a final spot-check of the completed system's performance.
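To make that concrete, here is a minimal sketch in Python with scikit-learn. The dataset, model, and hyperparameter grid are illustrative assumptions of mine, not from the course; the point is only that the CV set drives the hyperparameter choice while the test set is touched once at the end.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score

# A skewed (imbalanced) binary classification problem as a stand-in dataset.
X, y = make_classification(n_samples=5000, weights=[0.95, 0.05], random_state=0)

# 60/20/20 split into training, cross-validation, and test sets.
X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=0.4, random_state=0)
X_cv, X_test, y_cv, y_test = train_test_split(X_tmp, y_tmp, test_size=0.5, random_state=0)

best_model, best_f1 = None, -1.0
for C in [0.01, 0.1, 1.0, 10.0]:  # candidate regularization strengths
    model = LogisticRegression(C=C, max_iter=1000).fit(X_train, y_train)
    y_cv_pred = model.predict(X_cv)
    p = precision_score(y_cv, y_cv_pred)
    r = recall_score(y_cv, y_cv_pred)
    f1 = 2 * p * r / (p + r) if (p + r) > 0 else 0.0
    if f1 > best_f1:  # pick the candidate using CV metrics only
        best_model, best_f1 = model, f1

# The test set is used once, as a final spot-check of the chosen model.
y_test_pred = best_model.predict(X_test)
print("test precision:", precision_score(y_test, y_test_pred))
print("test recall:   ", recall_score(y_test, y_test_pred))
```

Combining precision and recall into a single CV metric (F1 here) lets you rank candidates without ever looking at the test set during tuning.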
You may present precision/recall values on test data to anyone, including your company’s CEO.
You may present precision/recall values on cross-validation data to your fellow data science colleagues to discuss which candidate model is best.
You may present and compare precision/recall values on training and CV data to decide how to tune your hyperparameters. However, it might not be good practice to tune the probability threshold value with precision or recall, because of the arguments here.
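On the threshold point, the precision/recall trade-off is easy to see by sweeping thresholds on the CV set. The snippet below continues the sketch above (it reuses `best_model`, `X_cv`, and `y_cv` from that sketch, which are my assumed names, not course code) and just prints a few points along the curve.

```python
from sklearn.metrics import precision_recall_curve

# Probability scores for the positive class on the CV set.
cv_scores = best_model.predict_proba(X_cv)[:, 1]
precisions, recalls, thresholds = precision_recall_curve(y_cv, cv_scores)

# Each threshold trades precision against recall, which is why maximizing
# either metric on its own tends to push the threshold to an extreme.
for p, r, t in list(zip(precisions, recalls, thresholds))[::20]:
    print(f"threshold={t:.2f}  precision={p:.2f}  recall={r:.2f}")
```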