Limitations of Pythia70M

devang_pagare · October 19, 2023, 1:48am

As per the course, we fine-tuned the pythia-70M model on lamini_docs.jsonl, a Q&A dataset. These are some of the limitations I found with this finetuning:

1. After finetuning, when we give the model a question from the dataset, it generates an answer, but after the first few sentences it starts repeating them. I think this is because we set max_output_length = 100, so the model just keeps going until it fills the output size. Is there any way to make the model stop generating once it has produced the relevant output?
2. After finetuning, I was hoping the model would be able to answer questions that are not in the dataset but are related to the Lamini docs. For example, the Lamini docs dataset has a total of 1400 question-answer pairs. If we ask a question that differs from those 1400 questions but whose answer appears among the 1400 answers, the model fails to generate a sensible, relevant output. In my opinion, the model should have learned the context of the entire document and should be able to answer any question related to it. (I may be wrong; please explain.)
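On the first point, one possible cause (not confirmed in this thread) is that the training examples never end with an end-of-sequence (EOS) token, so the model never learns where an answer stops and simply runs until the length cap. Below is a minimal sketch of two common workarounds: appending EOS to each training example, and cutting generated token ids at the first EOS. The helper names, the "question"/"answer" keys, the "<|endoftext|>" string and the id 0 are assumptions for illustration; check tokenizer.eos_token and tokenizer.eos_token_id for the tokenizer actually in use.

```python
from typing import Dict, List

# Assumption: "<|endoftext|>" is the EOS string of the tokenizer in use.
# Verify with tokenizer.eos_token before relying on it.
EOS_TOKEN = "<|endoftext|>"


def build_training_text(example: Dict[str, str], eos_token: str = EOS_TOKEN) -> str:
    """Join question and answer, ending with EOS so the model sees where answers stop."""
    # Hypothetical helper; assumes the dataset rows have "question" and "answer" keys.
    return example["question"] + example["answer"] + eos_token


def truncate_at_eos(token_ids: List[int], eos_token_id: int) -> List[int]:
    """Keep only the tokens generated before the first EOS token."""
    if eos_token_id in token_ids:
        return token_ids[: token_ids.index(eos_token_id)]
    # No EOS was generated: return everything unchanged (as a copy).
    return list(token_ids)


# Placeholder example, not a real row from lamini_docs.jsonl.
example = {"question": "What is Lamini?\n", "answer": "(answer text from the dataset)"}
print(build_training_text(example))

# Made-up token ids; 0 stands in for the EOS id in this illustration.
print(truncate_at_eos([11, 12, 13, 0, 11, 12, 13], eos_token_id=0))
```

If generation goes through Hugging Face transformers, generate() also accepts eos_token_id to stop at that token, and repetition_penalty or no_repeat_ngram_size to discourage loops. Whether these are enough for a model as small as pythia-70M is something to test, not a given.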
