Week 1: Pretraining Large Language Models


Significance of scale
The larger a model is, the more likely it is to work as needed without additional in-context learning or further training.
This observed trend of increased model capability with size has driven the development of larger models.

Is there more info, or links to papers, that prove this? Is it simply that with more data we are better fitting models that were previously under-fit?

Do we have metrics to check how well these LLMs are fitting? Can we determine accuracy, precision, recall, F1-score, etc. for LLMs?

Does it all boil down to how the word/sentence embeddings are computed? Shallow bag-of-words models (Latent Semantic Analysis, Latent Dirichlet Allocation, Term Frequency-Inverse Document Frequency) vs. neural-network-based doc2vec/BERT/GPT models?

There are studies out there; a quick Google search will turn them up. But yes, the underlying point is that with more data you get a better fit.

Yes, there are metrics: the ones you mentioned, plus others such as the ROUGE score. You will see how they are used in the coming weeks.
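
To make the ROUGE idea concrete, here is a minimal sketch of ROUGE-1 (unigram overlap) computed by hand in Python. The reference and candidate sentences are made up for illustration, and in practice you would use a library such as `rouge_score` rather than rolling your own.

```python
from collections import Counter

def rouge_1(reference: str, candidate: str) -> dict:
    """Compute unigram-overlap ROUGE-1 precision, recall, and F1."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Each unigram counts at most as often as it appears in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    precision = overlap / max(sum(cand_counts.values()), 1)
    recall = overlap / max(sum(ref_counts.values()), 1)
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}

# Hypothetical reference summary and model output, purely for illustration.
reference = "the cat sat on the mat"
candidate = "the cat lay on the mat"
print(rouge_1(reference, candidate))  # precision, recall, and F1 are all 5/6 here
```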

Not just that, but it is a crucial part; the Transformer architecture is also a very important evolutionary step.
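
To illustrate the contrast the question draws between shallow bag-of-words representations and dense learned embeddings, here is a minimal sketch assuming scikit-learn and NumPy are available. The corpus, embedding dimension, and random vectors are all made up for illustration; a real neural model such as BERT learns its dense vectors during training rather than sampling them.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy corpus, made up for illustration.
docs = [
    "large language models learn from text",
    "tf idf counts words without word order",
]

# Sparse bag-of-words representation: one dimension per vocabulary word.
tfidf = TfidfVectorizer()
sparse_vectors = tfidf.fit_transform(docs)
print(sparse_vectors.shape)  # (2, vocabulary_size)

# Dense representation: each token maps to a low-dimensional vector.
# Here the table is random; a neural model learns it during training.
rng = np.random.default_rng(0)
vocab = tfidf.get_feature_names_out()
embedding_dim = 8
embedding_table = {word: rng.normal(size=embedding_dim) for word in vocab}

# A crude document embedding: average the word vectors of the first document.
doc_embedding = np.mean(
    [embedding_table[w] for w in docs[0].split() if w in embedding_table], axis=0
)
print(doc_embedding.shape)  # (8,)
```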
