Parameter Finetuning

Can someone please explain how to use these parameters in finetuning?

per_device_train_batch_size=1, # Batch size for each device (e.g., GPU) during training.
gradient_accumulation_steps=8, # Number of forward/backward passes to accumulate before each optimizer step.
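From what I understand, these two settings multiply together (along with the number of devices) to give the effective batch size the optimizer actually sees. A minimal sketch of that arithmetic, where `num_devices` is a hypothetical value I picked for illustration:

```python
# My current understanding (please correct me if wrong):
# effective batch size = per-device batch size x accumulation steps x device count.
per_device_train_batch_size = 1   # samples processed per device per forward pass
gradient_accumulation_steps = 8   # forward passes accumulated before one optimizer step
num_devices = 2                   # hypothetical: e.g. 2 GPUs rented from a cloud provider

effective_batch_size = (
    per_device_train_batch_size * gradient_accumulation_steps * num_devices
)
print(effective_batch_size)  # 16
```

So with these numbers, one optimizer step would correspond to 16 training samples even though each GPU only holds 1 sample in memory at a time.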

Mr. Zu explained it a bit, but I still don't understand it.

My question:

  • What about people who use cloud providers for GPUs during training: how do we calculate the number of GPUs used in the training? Thank you in advance for a response.
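In case it helps anyone answer: I believe that on a cloud instance the number of GPUs visible to the training process can be checked with PyTorch's `torch.cuda.device_count()` (this assumes a PyTorch-based training stack; the surrounding code is just a sketch):

```python
import torch

# Count the GPUs visible to this process.
# On a CPU-only machine (or before GPUs are attached) this returns 0.
num_gpus = torch.cuda.device_count()
print(f"GPUs visible to training: {num_gpus}")
```

That count would then be the `num_devices` factor when working out the effective batch size from `per_device_train_batch_size` and `gradient_accumulation_steps`.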