Hi there,
I am at the training step, specifically trying the base model, and I have attached the response here.
I am not sure why I am getting a blank answer; the model is not outputting anything, not even an illogical answer.
The Lesson was: Training; “Try the base model”
Well, it seems the output is off, which is expected since I am trying the base model.
However, I then fine-tuned for 100_epochs/10,010_max_steps (took ~6 hrs) and it seems to be overfitting. But do you know why it shows 4 outputs as ###Answer and not just one plain statement?
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:0 for open-end generation.
### Question:
Who is the maintainer of the LLM model?
###Answer:
The maintainer of the LLM model is Aeala.
###Answer:The maintainer of the LLM model is Aeala.
###Answer:The maintainer of the LLM model is Aeala.
###Answer:The maintainer of the LLM model is Aeala.
###Answer:The maintainer of the LLM model
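From what I understand, the warning means that because `pad_token_id` was set to `eos_token_id` (0), the model cannot tell padding apart from real tokens unless an explicit `attention_mask` is passed. A toy sketch of what that mask looks like for a right-padded batch (the token ids here are made up, not from my run):

```python
# Toy illustration of padding a batch and building the attention mask
# (token ids and pad id are made up for the example).
pad_id = 0
batch = [[5, 9, 12], [7, 3]]
max_len = max(len(seq) for seq in batch)

# Right-pad every sequence to the same length with pad_id.
input_ids = [seq + [pad_id] * (max_len - len(seq)) for seq in batch]
# 1 marks a real token, 0 marks padding the model should ignore.
attention_mask = [[1] * len(seq) + [0] * (max_len - len(seq)) for seq in batch]

print(input_ids)       # [[5, 9, 12], [7, 3, 0]]
print(attention_mask)  # [[1, 1, 1], [1, 1, 0]]
```

Tokenizers normally return this mask for you; the point is just that `generate` needs it to know which positions to ignore.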
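As a workaround for the repeated blocks, I was thinking of truncating the generated text at the point where the ###Answer marker repeats, something like this sketch (the marker string is just taken from the prompt format above):

```python
def trim_at_stop(text: str, stop: str = "###Answer:") -> str:
    # Keep the first answer; any later occurrences of the stop
    # marker are run-on generations and get cut off.
    first = text.find(stop)
    if first == -1:
        return text.strip()
    # Look for the next marker after the first answer begins.
    nxt = text.find(stop, first + len(stop))
    if nxt == -1:
        return text.strip()
    return text[:nxt].strip()

raw = (
    "### Question:\nWho is the maintainer of the LLM model?\n"
    "###Answer:\nThe maintainer of the LLM model is Aeala.\n"
    "###Answer:The maintainer of the LLM model is Aeala.\n"
)
print(trim_at_stop(raw))  # ends after the first answer
```

Of course this only hides the symptom; the repetition itself presumably comes from generation continuing until the token limit instead of stopping at an EOS token.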