HuggingFace AutoTrain process stops soon after training starts

Could this be related to the hardware setup?
My local machine is an M1 Mac.

Hello @beefeaterGin

The error message says that the parameters being used were not supplied by you.

Next, it says your installed package nvidia-ml-py is corrupted and asks you to reinstall it. There could also be a version mismatch. You need to go through all the necessary files provided for this model to use it productively.

Although I am unsure why this is, you probably don’t have permission to use that file.

Are the block_size, model_max_length, mixed_precision, lr and epochs parameters set by you?

Regards
DP

@Deepti_Prasad
Hello, thanks for your answer. It’s helpful.

Although I am unsure why this is, you probably don’t have permission to use that file.

I hadn’t thought of this one. I’ll check it with my own file.

Are the block_size, model_max_length, mixed_precision, lr and epochs parameters set by you?

I think these are default values set by the system.

Try reinstalling the Nvidia package, but make sure the Python version of your local Jupyter matches the Hugging Face model details, as this could also be the issue.
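
Before reinstalling, it may help to see what the environment actually looks like. Below is a minimal sketch using only the Python standard library. It assumes the package in question is nvidia-ml-py (which provides the pynvml module); the function name environment_report is just something I made up for illustration. On an Apple Silicon Mac a native Python typically reports Darwin / arm64, and since those machines have no NVIDIA GPU, NVML itself cannot be expected to work there even if the package installs cleanly.

```python
import importlib.util
import platform
import sys
from importlib import metadata


def environment_report():
    """Collect basic facts that may help diagnose an NVML-related failure."""
    report = {
        # Interpreter version, to compare against what the model/tool expects.
        "python_version": "%d.%d.%d" % sys.version_info[:3],
        # OS name and CPU architecture as reported by the running interpreter.
        "system": platform.system(),
        "machine": platform.machine(),
        # True if the pynvml module (shipped by nvidia-ml-py) can be located.
        "pynvml_importable": importlib.util.find_spec("pynvml") is not None,
    }
    try:
        # Installed version of the nvidia-ml-py distribution, if present.
        report["nvidia_ml_py_version"] = metadata.version("nvidia-ml-py")
    except metadata.PackageNotFoundError:
        report["nvidia_ml_py_version"] = None
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Run it in the same Jupyter kernel you use for training, so the report reflects that interpreter and not a different Python on the machine.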

Let me know how it went!

P.S. I was actually trying out some Hugging Face data myself to better understand some of the Llama model versions, so your query caught my attention right away.

Keep Learning!!

Regards
DP

After hours of trying, I decided to switch to Colab, as I just wanted to try fine-tuning a model quickly.

Thanks,
BE
