Hi @PZ2004
I’m no expert on this, so what I usually do is run a few short test runs and check the memory footprint and the time to complete one epoch, to get a rough estimate of how long the model will take to train. It’s a pretty lame “strategy”, but it is what it is… Roughly something like the sketch below.
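Just to illustrate what I mean (a minimal sketch, assuming a PyTorch setup; `model`, `dataloader`, `optimizer` and `loss_fn` are placeholders for whatever you’re training):

```python
import time
import torch

def profile_one_epoch(model, dataloader, optimizer, loss_fn, device="cuda"):
    """Run a single training epoch and report wall-clock time and peak GPU memory."""
    model.to(device)
    model.train()
    if device == "cuda":
        torch.cuda.reset_peak_memory_stats(device)

    start = time.time()
    for inputs, targets in dataloader:
        inputs, targets = inputs.to(device), targets.to(device)
        optimizer.zero_grad()
        loss = loss_fn(model(inputs), targets)
        loss.backward()
        optimizer.step()
    elapsed = time.time() - start

    print(f"one epoch took {elapsed:.1f} s")
    if device == "cuda":
        peak_gb = torch.cuda.max_memory_allocated(device) / 1024**3
        print(f"peak GPU memory: {peak_gb:.2f} GB")
    return elapsed
```

Multiply the per-epoch time by the number of epochs you plan to run and you get a rough total (keeping in mind the first epoch is often a bit slower because of warm-up).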
From what I understand from your post, you shouldn’t have problems fitting the model into memory (which is usually the biggest problem), but the training time will likely be quite long (slow training).
I’m not sure if you have access to the Generative AI course lecture Computational challenges of training LLMs (if you don’t, it doesn’t matter - the main point is discussed here; btw I also think the calculations there are off, but a follow-up might reveal the truth).
Again, I’m no expert, but in my experience small/intermediate models (fewer than 300M parameters) are not a problem to train on a GPU; on a CPU, though, I think it would take ages.
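For a back-of-envelope memory estimate (just the usual rule of thumb, not an exact figure): plain fp32 training with Adam needs roughly 4 bytes per parameter for weights, 4 for gradients and 8 for the optimizer states, i.e. ~16 bytes per parameter before activations:

```python
def estimate_training_memory_gb(num_params, bytes_per_param=16):
    """Rough rule of thumb for fp32 + Adam:
    4 B weights + 4 B gradients + 8 B optimizer states = ~16 B per parameter
    (activations not included)."""
    return num_params * bytes_per_param / 1024**3

print(f"{estimate_training_memory_gb(300e6):.1f} GB")  # ~4.5 GB for a 300M-parameter model
```

So a 300M-parameter model is around 4-5 GB for the training state itself, which is why fitting it into memory usually isn’t the issue - the compute time is.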
Cheers