Expected all tensors to be on the same device

I am getting the following runtime error in Lab 1 of Week 1:
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument mat1 in method wrapper_addmm)

Here is my code:

# UNQ_C6 (UNIQUE CELL IDENTIFIER, DO NOT EDIT)
# GRADED FUNCTION: get_disc_loss

def get_disc_loss(gen, disc, criterion, real, num_images, z_dim, device):
    '''
    Return the loss of the discriminator given inputs.
    Parameters:
        gen: the generator model, which returns an image given z-dimensional noise
        disc: the discriminator model, which returns a single-dimensional prediction of real/fake
        criterion: the loss function, which should be used to compare
            the discriminator's predictions to the ground truth reality of the images
            (e.g. fake = 0, real = 1)
        real: a batch of real images
        num_images: the number of images the generator should produce,
            which is also the length of the real images
        z_dim: the dimension of the noise vector, a scalar
        device: the device type
    Returns:
        disc_loss: a torch scalar loss value for the current batch
    '''
    # These are the steps you will need to complete:
    #   1) Create noise vectors and generate a batch (num_images) of fake images.
    #      Make sure to pass the device argument to the noise.
    #   2) Get the discriminator's prediction of the fake image
    #      and calculate the loss. Don't forget to detach the generator!
    #      (Remember the loss function you set earlier - criterion. You need a
    #      'ground truth' tensor in order to calculate the loss.
    #      For example, a ground truth tensor for a fake image is all zeros.)
    #   3) Get the discriminator's prediction of the real image and calculate the loss.
    #   4) Calculate the discriminator's loss by averaging the real and fake loss
    #      and set it to disc_loss.
    # Important: You should NOT write your own loss function here - use criterion(pred, true)!
    #### START CODE HERE ####
    # torch.device(device)
    random_numbers = get_noise(num_images, z_dim, device)
    # print(random_numbers)
    fake_images = gen(random_numbers).detach()
    # print(fake_images)
    predicted_fake = disc(fake_images)
    disc_loss_fake = criterion(predicted_fake, torch.zeros_like(predicted_fake))
    # print(disc_loss_fake)
    predicted_real = disc(real)
    disc_loss_real = criterion(predicted_real, torch.ones_like(predicted_real))
    # print(disc_loss_real)
    disc_loss = (disc_loss_real + disc_loss_fake) * 0.5
    # print(disc_loss)
    #### END CODE HERE ####
    return disc_loss

What am I missing?

For completeness, here is the whole output:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Input In [30], in <cell line: 74>()
     71     break
     73 test_disc_reasonable()
---> 74 test_disc_loss()
     75 print("Success!")

Input In [30], in test_disc_loss(max_tests)
     50 disc_opt.zero_grad()
     52 # Calculate discriminator loss
---> 53 disc_loss = get_disc_loss(gen, disc, criterion, real, cur_batch_size, z_dim, device)
     54 assert (disc_loss - 0.68).abs() < 0.05
     56 # Update gradients

Input In [29], in get_disc_loss(gen, disc, criterion, real, num_images, z_dim, device)
     34 random_numbers = get_noise(num_images, z_dim, device)
     35 #print(random_numbers)
---> 36 fake_images = gen(random_numbers).detach()
     37 #print(fake_images)
     38 predicted_fake = disc(fake_images)

File /usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py:1194, in Module._call_impl(self, *input, **kwargs)
   1190 # If we don't have any hooks, we want to skip the rest of the logic in
   1191 # this function, and just call forward.
   1192 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1193         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1194     return forward_call(*input, **kwargs)
   1195 # Do not call functions when jit is used
   1196 full_backward_hooks, non_full_backward_hooks = [], []

Input In [5], in Generator.forward(self, noise)
     26 def forward(self, noise):
     27     '''
     28     Function for completing a forward pass of the generator: Given a noise tensor,
     29     returns generated images.
     30     Parameters:
     31         noise: a noise tensor with dimensions (n_samples, z_dim)
     32     '''
---> 33     return self.gen(noise)

File /usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py:1194, in Module._call_impl(self, *input, **kwargs)
   1190 # If we don't have any hooks, we want to skip the rest of the logic in
   1191 # this function, and just call forward.
   1192 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1193         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1194     return forward_call(*input, **kwargs)
   1195 # Do not call functions when jit is used
   1196 full_backward_hooks, non_full_backward_hooks = [], []

File /usr/local/lib/python3.8/dist-packages/torch/nn/modules/container.py:204, in Sequential.forward(self, input)
    202 def forward(self, input):
    203     for module in self:
-> 204         input = module(input)
    205     return input

File /usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py:1194, in Module._call_impl(self, *input, **kwargs)
   1190 # If we don't have any hooks, we want to skip the rest of the logic in
   1191 # this function, and just call forward.
   1192 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1193         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1194     return forward_call(*input, **kwargs)
   1195 # Do not call functions when jit is used
   1196 full_backward_hooks, non_full_backward_hooks = [], []

File /usr/local/lib/python3.8/dist-packages/torch/nn/modules/container.py:204, in Sequential.forward(self, input)
    202 def forward(self, input):
    203     for module in self:
-> 204         input = module(input)
    205     return input

File /usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py:1194, in Module._call_impl(self, *input, **kwargs)
   1190 # If we don't have any hooks, we want to skip the rest of the logic in
   1191 # this function, and just call forward.
   1192 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1193         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1194     return forward_call(*input, **kwargs)
   1195 # Do not call functions when jit is used
   1196 full_backward_hooks, non_full_backward_hooks = [], []

File /usr/local/lib/python3.8/dist-packages/torch/nn/modules/linear.py:114, in Linear.forward(self, input)
    113 def forward(self, input: Tensor) -> Tensor:
-> 114     return F.linear(input, self.weight, self.bias)

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument mat1 in method wrapper_addmm)

Are you sure you didn’t hard-code the device in one of the supporting functions (e.g. get_noise)? As far as I can see (although the formatting is a bit funny: the </> tool will help with code), your code looks correct and I don’t see any explicit references to the device.

There are two main entities at play here, data and model, and both of them need to be in the same place for the calculation to happen.

By default, tensors and models are created on the CPU, and it is possible that either the data or the model is being transferred to cuda while the other one is not, since you are setting the device with torch.device(device). All you need is to send the other entity to cuda too, using x.to(device).
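To make that concrete, here is a minimal sketch (a hypothetical model and batch, not the lab's actual code) showing both pieces ending up on the same device:

import torch
from torch import nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(10, 1).to(device)   # the model's parameters now live on `device`
batch = torch.randn(128, 10)          # tensors are created on the CPU by default

output = model(batch.to(device))      # move the data to the same device before the forward pass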

Another thing you can do is remove any reference to cuda in the code, which will force everything to be on the CPU.
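For illustration only (again a hypothetical model, not the lab's), with every tensor and module left on the CPU there is nothing to mismatch:

import torch
from torch import nn

device = torch.device("cpu")                 # no reference to cuda anywhere
model = nn.Linear(10, 1).to(device)          # hypothetical model, just for illustration
noise = torch.randn(4, 10, device=device)    # data created on the same (CPU) device
out = model(noise)                           # no device mismatch is possible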

As I pointed out in my earlier reply on this thread, the notebooks are set up to run in “cuda” mode when they are running under your control, so that the training works faster. But when the grader runs, it uses “cpu” mode to save money. The code in the notebook is set up so that the device is passed as an argument where required. You should just respect those arguments and not explicitly set the device (“hard-code” it) in any way that is not already present in the provided template and test code.
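As a sketch of what respecting the device argument looks like (assuming the get_noise signature used in the assignment), the helper should forward the device it receives straight to torch.randn rather than fixing it:

import torch

def get_noise(n_samples, z_dim, device='cpu'):
    # Pass the caller's device straight through to torch.randn instead of
    # hard-coding device='cuda', which breaks when everything else is on the CPU.
    return torch.randn(n_samples, z_dim, device=device)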

It is possible I hard-coded it. You know the old saying: I didn't change anything :smile: … I did notice that initially I never could get past the test for the get_noise function. It would always fail when trying to pass the device parameter to torch.randn.
At any rate, I just ended up running everything on the CPU.
Fair warning: I am by no means a PyTorch expert.