During the error analysis videos, when a lot of synthetic data and only a small amount of target data are available, why is pre-training on the synthetic data and then fine-tuning on the target data not mentioned as a possible approach? It sounds like the same problem discussed in the transfer learning chapter.
Transfer learning works really well when there is significant overlap between the characteristics of the pre-training (source) data and those of the target data. If the synthetic data only partially covers the characteristics of the target data, pre-training on it may not be ideal. In such cases, methods such as data blending or augmentation may be more effective.
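As a rough illustration of the blending idea, here is a minimal Python sketch that mixes a small target set with a controlled fraction of synthetic examples before training. The function and parameter names are placeholders for illustration, not part of the course material.

```python
import random

def blend_datasets(target_data, synthetic_data, synthetic_ratio=0.5, seed=0):
    """Blend a small target dataset with sampled synthetic examples.

    synthetic_ratio is the fraction of the blended set that should come
    from synthetic data; all names here are illustrative assumptions.
    """
    rng = random.Random(seed)
    # How many synthetic examples are needed to hit the requested ratio.
    n_synthetic = int(len(target_data) * synthetic_ratio / (1 - synthetic_ratio))
    n_synthetic = min(n_synthetic, len(synthetic_data))
    blended = list(target_data) + rng.sample(list(synthetic_data), n_synthetic)
    rng.shuffle(blended)
    return blended
```

Training on the blended set keeps the target examples in every epoch, rather than relegating them to a short fine-tuning phase after a purely synthetic pre-training run.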
The objective of pre-training is to obtain “generic” parameters that capture generic features - for example, in LLM pre-training it is assumed that the data set provides enough information about the structure of language and grammar. Using synthetic data for this purpose may or may not work - teacher forcing is one of the main affected steps if the synthetic data contains a lot of “hallucinations”, because the hallucinated tokens become the very targets the model is trained to reproduce.
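To make the teacher-forcing point concrete, here is a minimal PyTorch-style sketch of the next-token loss used in LLM pre-training; `model` and the tensor shapes are assumptions for illustration only. Note that the (possibly hallucinated) synthetic tokens serve both as the conditioning context and as the targets.

```python
import torch
import torch.nn.functional as F

def teacher_forced_loss(model, token_ids):
    """Next-token cross-entropy with teacher forcing.

    `model` is assumed to be any causal LM mapping token ids [B, T]
    to logits [B, T, V]; the name and shapes are illustrative.
    The ground-truth tokens (synthetic ones included) are fed as
    inputs at each step and also used as the prediction targets.
    """
    inputs = token_ids[:, :-1]    # tokens the model conditions on
    targets = token_ids[:, 1:]    # the "true" next tokens it must imitate
    logits = model(inputs)        # [B, T-1, V]
    return F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        targets.reshape(-1),
    )
```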
There are no major concerns with fine-tuning on your target data set beyond its size and variety - if you train for too many epochs on a small data set, or on a data set with limited variation, the adapter layers are likely to memorize the target data.
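One common guard against that memorization is to hold out a small validation split and stop fine-tuning the adapters once validation loss stops improving. The sketch below assumes a Hugging-Face-style model whose forward pass returns a `.loss` and whose adapter parameters are the only ones left trainable; every name here is illustrative, not a prescribed recipe.

```python
import copy
import torch

def finetune_with_early_stopping(model, train_loader, val_loader,
                                 max_epochs=20, patience=2, lr=1e-4):
    """Fine-tune only the trainable (adapter) parameters, stopping when
    validation loss stops improving to limit memorization of a small set."""
    trainable = [p for p in model.parameters() if p.requires_grad]
    optimizer = torch.optim.AdamW(trainable, lr=lr)
    best_val, best_state, bad_epochs = float("inf"), None, 0

    for epoch in range(max_epochs):
        model.train()
        for batch in train_loader:
            optimizer.zero_grad()
            loss = model(**batch).loss   # assumes a forward pass that returns .loss
            loss.backward()
            optimizer.step()

        # Evaluate on the held-out split to detect overfitting/memorization.
        model.eval()
        with torch.no_grad():
            val_loss = sum(model(**b).loss.item() for b in val_loader) / len(val_loader)

        if val_loss < best_val:
            best_val = val_loss
            best_state = copy.deepcopy(model.state_dict())
            bad_epochs = 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                break

    if best_state is not None:
        model.load_state_dict(best_state)
    return model
```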