How can compute power by itself increase model performance?

fernandezpablo · July 19, 2023, 12:02am

In the “Scaling laws and compute-optimal models” video, the instructor says:

The relationship here holds as long as model size and training dataset size don’t inhibit the training process. Taken at face value, this would suggest that you can just increase your compute budget to achieve better model performance.

That would apply even when the data and the model parameters are fixed. Having trained ML models in the past (though not LLMs), I fail to understand how the exact same data and the exact same model (parameters) can improve performance just by throwing more GPUs at the problem.

Could it be that, since the model trains using self-supervised learning, a larger compute budget just means more rounds of training over the same dataset?

rmwkwok · July 19, 2023, 12:22am

Yes, I think that with a larger compute budget we can train for more steps. The best approach would be to read the paper to see how they analyze the problem.
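To make the "more steps" point concrete, here is a minimal sketch of the bookkeeping. It assumes the common rule of thumb that training a dense transformer costs roughly 6 FLOPs per parameter per token (an approximation, not something stated in the video), and the function names and example sizes below are hypothetical, chosen only for illustration.

```python
# Assumption: training compute C is roughly 6 * N * D FLOPs, where N is the
# parameter count and D is the number of tokens processed during training.
# This is a widely used approximation, not an exact law.
FLOPS_PER_PARAM_PER_TOKEN = 6


def tokens_processed(compute_flops: float, n_params: float) -> float:
    """Tokens a model of n_params parameters can train on within a FLOP budget."""
    return compute_flops / (FLOPS_PER_PARAM_PER_TOKEN * n_params)


def passes_over_dataset(
    compute_flops: float, n_params: float, dataset_tokens: float
) -> float:
    """How many times a fixed dataset is traversed within a FLOP budget."""
    return tokens_processed(compute_flops, n_params) / dataset_tokens


if __name__ == "__main__":
    n_params = 1e9         # hypothetical 1B-parameter model, held fixed
    dataset_tokens = 20e9  # hypothetical 20B-token dataset, held fixed

    # Budget needed for exactly one pass over the dataset.
    one_pass_flops = FLOPS_PER_PARAM_PER_TOKEN * n_params * dataset_tokens

    for multiplier in (1, 2, 4):
        budget = multiplier * one_pass_flops
        passes = passes_over_dataset(budget, n_params, dataset_tokens)
        print(f"{multiplier}x compute: {passes:.1f} passes over the dataset")
```

With the model and dataset held fixed, the extra compute can only be spent on more training steps, which means more passes over the same tokens. The sketch shows only this accounting; it does not model how much the loss actually improves, and gains from repeated passes over the same data are generally expected to diminish.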

Cheers,
Raymond
