Is it logical to mix and match training and test sets once a model is selected?

Hi,
Once a model and its hyperparameters have been selected, should we re-train on the entire dataset before sending it out into the real world?
I read this in a book and it seemed logical to me, since more data means less overfitting.
I'd like to hear your insights on the topic.

Here is the quote from the book (Corey Wade, Hands-On Gradient Boosting with XGBoost and scikit-learn):

'When testing, it's important not to mix and match training and test sets. After a final model has been selected, however, fitting the model on the entire dataset can be beneficial. Why? Because the goal is to test the model on data that has never been seen, and fitting the model on the entire dataset may lead to additional gains in accuracy.'
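A minimal sketch of the workflow the quote describes, using scikit-learn's `GradientBoostingClassifier` on a synthetic dataset. The dataset, split sizes, and hyperparameter grid are illustrative assumptions, not from the book:

```python
# Sketch: select hyperparameters on the training set, evaluate once on
# the held-out test set, then refit the chosen model on ALL the data.
# Dataset and grid values are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = make_classification(n_samples=1000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# 1. Tune hyperparameters using cross-validation on the training set only.
search = GridSearchCV(
    GradientBoostingClassifier(random_state=0),
    param_grid={"max_depth": [2, 3], "n_estimators": [50, 100]},
    cv=3,
)
search.fit(X_train, y_train)

# 2. Touch the test set exactly once, to estimate real-world performance.
print("test accuracy:", search.score(X_test, y_test))

# 3. Final model for deployment: refit the winning configuration on the
#    entire dataset, as the quote suggests.
final_model = GradientBoostingClassifier(
    random_state=0, **search.best_params_
).fit(X, y)
```

The key point is ordering: the test-set score is recorded before the final refit, so the refit on all data cannot leak into the reported estimate.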

Makes sense to do as the book says.


I agree, but I wonder whether it is best to train from scratch on the combined train+val+test data, or to treat it more like a transfer learning problem: refining the weights learned on the train set with the (likely smaller) val and test sets. Starting from scratch seems to introduce the possibility of going off the rails, with no way to assess the model until it underperforms in production.
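One way to sketch the "refine rather than restart" idea for boosted trees is scikit-learn's `warm_start` flag, which keeps the trees already fitted and only adds new boosting rounds. This is a stand-in for the transfer-learning analogy, not a method from the book, and the split sizes and round counts are assumptions:

```python
# Sketch: keep the ensemble learned on the training split, then continue
# boosting on the full dataset instead of refitting from scratch.
# Round counts and split sizes are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, random_state=0)
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Fit the selected configuration on the training split only.
model = GradientBoostingClassifier(
    n_estimators=100, warm_start=True, random_state=0
)
model.fit(X_train, y_train)

# "Refine": with warm_start=True, raising n_estimators and calling fit
# again keeps the existing 100 trees and fits 20 new rounds on all the
# data, rather than discarding what was learned on the train split.
model.n_estimators += 20
model.fit(X, y)
print(len(model.estimators_))  # 120 rounds total
```

XGBoost offers a similar continuation mechanism (passing an existing booster to `xgb.train` via the `xgb_model` argument), which is closer to what you would use with the book's library of choice.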


Yes, it should be transfer learning; after all, as you say, training from scratch might overfit and underperform in production. At least with the val/test splits you avoid some of the overfitting on this dataset.