Hi,
In the Week 1 video lecture “Computational challenges of training LLMs”, it is mentioned that:
1 parameter = 4 bytes (32-bit full precision)
1B parameters = 4 GB
An extra 20 bytes per parameter are required for other things (activations, gradients, etc.).
How come 80 GB of RAM is required to train a 1B-parameter model? Shouldn't it be 24 GB, with the 20 bytes of extra memory per parameter?
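Here is the same arithmetic in code form, using the lecture's own figures:

```python
# Back-of-the-envelope memory estimate for training a 1B-parameter model
# in 32-bit full precision, using the lecture's figures quoted above.
num_params = 1_000_000_000       # 1B parameters
weight_bytes = 4                 # fp32 weight = 4 bytes
extra_bytes = 20                 # lecture's overhead estimate per parameter

weights_gb = num_params * weight_bytes / 1e9
training_gb = num_params * (weight_bytes + extra_bytes) / 1e9
print(f"weights alone: {weights_gb:.0f} GB")    # -> 4 GB
print(f"training total: {training_gb:.0f} GB")  # -> 24 GB, not 80 GB
```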
Please let me know if I am missing something here.
Thanks,
Pandu
Could that be the data (the inputs) and the embeddings? During training you also need memory for the tensors that get multiplied with the weights, not just for the quantities associated with the weights themselves.
WOW!!!
Hi @GUGULOTHU_PANDU! I think you may have found an error in the lecture! I've done the math and cross-checked it against other sources, and your numbers seem right! We would need more like 20 GB of RAM for 1B params!
I will raise this point internally and share any outcome with you.
Thanks!
Juan
@Juan_Olano What was the result of this? Does this mean that if I wanted to load the 1B-parameter model locally, out of the box with no quantization, I would need about 20-24 GB of RAM? I'm tempted to allocate the difference (80 GB - 20 GB = 60 GB) to a full fine-tune, because in Week 2 it's clearly stated that a full copy of the model is created for every task.
Hi, I have rewatched this part of the video a few times to get a better understanding, and it isn't completely clear to me either. The only thing that could hint at justifying the 80 GB for a 1B-parameter model is a statement in the video along these lines: due to the overhead that occurs during model training, we would need about 20 times more memory for training compared to the model size. Can a mentor please confirm and make sense of this? What exactly is the overhead? My rough guess at a breakdown is sketched below.
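This split is just an illustration I pieced together for plain fp32 training with Adam, not the lecture's official numbers:

```python
# Illustrative per-parameter memory breakdown for fp32 training with Adam.
# These figures are a common rule of thumb, not the lecture's official split.
overhead_bytes = {
    "gradients": 4,            # one fp32 gradient per weight
    "Adam first moment": 4,    # running average of gradients
    "Adam second moment": 4,   # running average of squared gradients
    "activations + temps": 8,  # varies with batch size and sequence length
}

weights = 4  # the fp32 weight itself
total = weights + sum(overhead_bytes.values())
print(f"~{total} bytes/parameter -> ~{total} GB for 1B parameters")  # ~24, not 80
```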
Hi Juan, do you have an update on this?
Hi everyone. This will be corrected soon. The number should only be about 24 GB instead of 80 GB. Thank you for reporting!
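For anyone who wants to sanity-check the overhead themselves, here is a minimal PyTorch experiment. It is only a rough sketch: it assumes a machine with a CUDA GPU and uses a small toy layer, so the ratio will not match a 1B-parameter transformer exactly:

```python
import torch

# Compare the memory of the weights alone with the peak memory after one
# fp32 training step with Adam. Toy layer, not a 1B-parameter model.
model = torch.nn.Linear(4096, 4096).cuda()
params_mb = sum(p.numel() * 4 for p in model.parameters()) / 1e6
opt = torch.optim.Adam(model.parameters())

x = torch.randn(64, 4096, device="cuda")
loss = model(x).pow(2).mean()
loss.backward()  # allocates gradients
opt.step()       # Adam allocates its two moment buffers here

print(f"weights alone: {params_mb:.0f} MB")
print(f"peak during step: {torch.cuda.max_memory_allocated() / 1e6:.0f} MB")
```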