Generative AI with Large Language Models
Week 1: Computational challenges of training LLMs
Why are only the weights quantized and not the rest of the model's parameters?
Also, would it make sense to standardize/normalize the model weights and parameters?
It could harmonize the use of all the available bits and help us select the best format among FP32/FP16/BF16/INT8 (a rough sketch of what I mean is at the end of this post).
But maybe what the model learns might go haywire unless the vector embeddings also use the same format, so that encoding/decoding stays consistent.
Any thoughts on this?
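
For concreteness, here is a minimal sketch (not from the course materials) of the kind of rescaling I mean, comparing plain FP16/BF16 casts against symmetric linear quantization to INT8. The tensor values and the per-tensor scale are just illustrative assumptions:

```python
import torch

# A toy FP32 weight tensor standing in for one layer's weights
# (hypothetical values, purely for illustration).
w_fp32 = torch.randn(4, 4) * 3.0

# Casting to lower-precision floating-point formats is a direct conversion.
w_fp16 = w_fp32.to(torch.float16)
w_bf16 = w_fp32.to(torch.bfloat16)

# For INT8, one common approach is symmetric linear quantization:
# rescale by the max absolute value so the weights map onto [-127, 127].
# This rescaling is roughly the "normalization" idea from the question.
scale = w_fp32.abs().max() / 127.0
w_int8 = torch.clamp(torch.round(w_fp32 / scale), -127, 127).to(torch.int8)

# Dequantize and compare the rounding error introduced by each format.
w_int8_dequant = w_int8.to(torch.float32) * scale
for name, w in [("fp16", w_fp16), ("bf16", w_bf16), ("int8", w_int8_dequant)]:
    err = (w_fp32 - w.to(torch.float32)).abs().max()
    print(f"{name}: max abs error = {err:.6f}")
```

If I understand it correctly, the scale factor has to be stored alongside the INT8 weights so they can be dequantized consistently at inference time, which is basically the encoding/decoding concern I raised above about the embeddings.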