Hi all!
I completed Lab #2 on my MacBook without any problems. When I got a VM with a GPU from a local cloud provider, training in step 2.2 became fantastically fast, but after saving the trained model and loading it again I started getting errors in step 2.3:
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument index in method wrapper__index_select)
So if I understand correctly, all the previous steps ran on the GPU, but after these three commands everything breaks:
trainer.model.save_pretrained(trained_model_path)
instructmodel = AutoModelForSeq2SeqLM.from_pretrained("./trained_model",torch_dtype=torch.bfloat16)
tokenizer.save_pretrained(trained_model_path)
Can someone help with this?
@Kirill_Naumenko ,
Hi, typically this issue occurs when you try to perform an operation between two tensors that are not on the same device. For example:
tensor1 = tensor1.to('cuda')
tensor2 = tensor2.to('cpu')
To avoid this, define a device variable at the beginning of your program and use it throughout.
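A minimal sketch of that pattern (the tensor names here are just illustrative):

```python
import torch

# Choose the device once, at the start of the program.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Move every tensor to that same device before combining them.
tensor1 = torch.ones(3).to(device)
tensor2 = torch.zeros(3).to(device)
result = tensor1 + tensor2  # both operands share a device, so this works
```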
Keep learning!
Hi carlosrl!
Thanks, you are right about that.
I did some investigating before posting this topic and tried to understand how to work with different devices. I confirmed that my device is the GPU by default:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
device
device(type='cuda')
But I have some trouble understanding what a tensor is in the context of our lessons. We did not cover this term, so I assumed that model = tensor and changed this line:
instructmodel = instructmodel.to(device)
trained_model_path="./trained_model"
trainer.model.save_pretrained(trained_model_path)
instructmodel = AutoModelForSeq2SeqLM.from_pretrained("./trained_model",torch_dtype=torch.bfloat16)
instructmodel = instructmodel.to(device)
tokenizer = AutoTokenizer.from_pretrained(trained_model_path)
But I got the same error in step 2.3:
input_ids = tokenizer(prompt, return_tensors='pt').input_ids
original_model_outputs = original_model.generate(input_ids=input_ids, generation_config=GenerationConfig(max_new_tokens=200, num_beams=1))
original_model_text_output = tokenizer.decode(original_model_outputs[0], skip_special_tokens=True)
instructed_model_outputs = instructmodel.generate(input_ids=input_ids, generation_config=GenerationConfig(max_new_tokens=200, num_beams=1))
instructed_model_text_output = tokenizer.decode(instructed_model_outputs[0], skip_special_tokens=True)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument index in method wrapper__index_select)
The interesting thing is that loading the trained model also breaks the outputs of the original model.
I think my mistake is in not understanding where the input/output tensors live; after loading the model from the file system following training in step 2.2, I need to somehow ensure that everything flows through the GPU as before.
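One way to narrow this down is to check which device the model's parameters actually live on after `.to(device)`; a minimal sketch using a stand-in `nn.Linear` (the same check works on `instructmodel` and `original_model`):

```python
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(4, 2)   # stand-in for the loaded Seq2Seq model
model = model.to(device)

# Every parameter of a moved model reports the target device.
param_device = next(model.parameters()).device
print(param_device)
```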
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
at the beginning.
Adding ".to(device)" to input_ids solved the same problem for me:
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)
each time input_ids is instantiated.
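That matches the error text: `generate` starts with an embedding lookup (`index_select`), which fails when the weights sit on `cuda:0` but `input_ids` is still on the CPU, since the tokenizer always returns CPU tensors. A toy sketch of the failure mode and the fix (the embedding here stands in for the model's input embedding):

```python
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Stand-in for the model's input embedding, moved to the GPU when available.
embedding = nn.Embedding(10, 4).to(device)

# The tokenizer returns CPU tensors, like input_ids in step 2.3.
input_ids = torch.tensor([[1, 2, 3]])

# Moving input_ids to the model's device avoids the index_select error.
output = embedding(input_ids.to(device))
```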
Regards,
Magüi