Pre-training for Adaptation

Ajay_Bawa · July 30, 2023, 8:45pm

Is Pre-training for Adaptation, as explained in Week 1 with BloombergGPT, different from fine-tuning?

BloombergGPT was trained on 49% public (non-financial) data and 51% public and private finance-specific data. Given that, why do we call it pre-training rather than fine-tuning?

Thanks,

rmwkwok · July 31, 2023, 12:23am

Hello @Ajay_Bawa,

Pre-training produces a base model, and fine-tuning adapts that base model to a particular use case. When we pre-train, we can start with no existing model at all. When we fine-tune, we start from a pre-trained base model.
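To make the difference in starting points concrete, here is a minimal toy sketch in plain Python. It is not how an LLM is trained: the linear model, the data, and the names `init_weights`, `train`, `general_data`, and `domain_data` are all made up for illustration. The only point is where each training run starts from.

```python
import random

def init_weights(n, seed=0):
    """Random initialization: the starting point when there is no model yet."""
    rng = random.Random(seed)
    return [rng.uniform(-0.1, 0.1) for _ in range(n)]

def train(weights, data, lr=0.1, epochs=200):
    """Gradient descent on squared error for a toy linear model y = w . x."""
    w = list(weights)  # copy, so the caller's weights are left untouched
    for _ in range(epochs):
        for x, y in data:
            pred = sum(wi * xi for wi, xi in zip(w, x))
            err = pred - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w

# Hypothetical toy data: a broad "general" set and a small "domain" set.
general_data = [((1.0, 0.0), 2.0), ((0.0, 1.0), -1.0), ((1.0, 1.0), 1.0)]
domain_data = [((1.0, 0.0), 2.5), ((0.0, 1.0), -1.0)]

# Pre-training: start from random weights and train on the general data.
base = train(init_weights(2), general_data)

# Fine-tuning: start from the pre-trained weights and train briefly,
# with a smaller learning rate, on the domain data.
tuned = train(base, domain_data, lr=0.05, epochs=20)

print("base :", base)
print("tuned:", tuned)
```

The same `train` function is used both times. What makes the second call fine-tuning is that it starts from `base` instead of from a random initialization.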

Cheers,
Raymond

Ajay_Bawa · July 31, 2023, 1:35am

Yes, that’s how I understand it too. In the case of BloombergGPT, was there an existing pre-trained foundation model, or was a new model built from scratch?

Juan_Olano · July 31, 2023, 2:02am

Most probably BloombergGPT was written from scratch and pre-trained on the vast amount of information that Bloomberg has access to.

You see, the thing with these models is that they take very few lines of code compared with traditional software. The challenges with the models are:

The data. You need huge amounts of data to train a 50-billion-parameter model.
The compute resources to train them. You need lots and lots of GPUs to train a multi-billion-parameter model.

Their training corpus contained over 700 billion tokens.
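For a rough sense of scale, the sketch below applies a commonly used rule of thumb, roughly 6 floating-point operations per parameter per training token. That rule is an assumption brought in here, not something stated in this thread. It takes the 50 billion parameters and roughly 700 billion tokens mentioned above as inputs. The per-GPU throughput and the GPU count are hypothetical placeholders, not Bloomberg's actual setup.

```python
def training_flops(n_params, n_tokens, flops_per_param_token=6):
    """Rule-of-thumb estimate of total training compute (an approximation)."""
    return flops_per_param_token * n_params * n_tokens

def gpu_days(total_flops, flops_per_gpu_per_sec, n_gpus):
    """Wall-clock days if every GPU sustains the given effective throughput."""
    seconds = total_flops / (flops_per_gpu_per_sec * n_gpus)
    return seconds / 86_400

n_params = 50e9    # 50-billion-parameter model (from the thread)
n_tokens = 700e9   # roughly 700 billion tokens (from the thread)

total = training_flops(n_params, n_tokens)

# Hypothetical hardware assumptions, for illustration only.
assumed_throughput = 1e14  # effective FLOP/s per GPU
assumed_gpus = 512

print(f"estimated compute: {total:.2e} FLOPs")
print(f"estimated time: {gpu_days(total, assumed_throughput, assumed_gpus):.0f} days "
      f"on {assumed_gpus} GPUs")
```

Even under these optimistic assumptions, the estimate comes out at weeks of training on hundreds of GPUs, which is why compute is listed as one of the two main challenges.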

rmwkwok · July 31, 2023, 3:55am

You can read their paper for the details. To me, it looks like it was trained from scratch, and @Juan_Olano has listed some good points that support this.
