C1_M4 Assignment issue - Expected all tensors to be on the same device

Hello everyone. I encountered an error with the C1_M4 3.2 assignment. The task itself is very clear, a pretty regular training loop that I've written many times before, but here specifically the loss function throws an error:
```
File /usr/local/lib/python3.12/site-packages/torch/nn/functional.py:3494, in cross_entropy(input, target, weight, size_average, ignore_index, reduce, reduction, label_smoothing)
3492 if size_average is not None or reduce is not None:
3493 reduction = _Reduction.legacy_get_string(size_average, reduce)
→ 3494 return torch._C._nn.cross_entropy_loss(
3495 input,
3496 target,
3497 weight,
3498 _Reduction.get_enum(reduction),
3499 ignore_index,
3500 label_smoothing,
3501 )

RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument target in method wrapper_CUDA_nll_loss_forward)
```
I have double-checked the devices of the model, the output, and the labels; all are on cuda, but I still get this error.
I tried overriding the device to CPU to avoid this error, and then started getting the following error:
```
File /usr/local/lib/python3.12/site-packages/torch/autograd/graph.py:823, in _engine_run_backward(t_outputs, *args, **kwargs)
821 unregister_hooks = _register_logging_hooks_on_whole_graph(t_outputs)
822 try:
→ 823 return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
824 t_outputs, *args, **kwargs
825 ) # Calls into the C++ engine to run the backward pass
826 finally:
827 if attach_logging_hooks:

RuntimeError: Trying to backward through the graph a second time (or directly access saved tensors after they have already been freed). Saved intermediate values of the graph are freed when you call .backward() or autograd.grad(). Specify retain_graph=True if you need to backward through the graph a second time or if you need to access saved tensors after calling backward.
```
even though I call optimizer.zero_grad() at the start of each loop iteration.
Am I doing something wrong?
The code I have is really basic (which makes it even more frustrating):
```
{moderator edit - solution code removed}
```
Like it doesn’t get simpler than this :man_shrugging:

You must not be handling the device assignment correctly for some of your tensors. They discussed this in the lectures and the coding examples here in Course 1, although I don’t remember offhand exactly where those discussions were.

I just went through my copy of this notebook, and the template code already gives you the logic to copy the images and labels to the target device. Did you perhaps modify some of that code?
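For reference, that device-handling logic usually looks something like the following minimal sketch. The names here (model, loader, criterion) are illustrative stand-ins, not the notebook's actual code; the key lines are the two .to(device) calls inside the loop, which keep the inputs and targets on the same device as the model:

```python
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(4, 3).to(device)       # move the model to the device once
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# A fake one-batch "loader" just to make the sketch self-contained
loader = [(torch.randn(8, 4), torch.randint(0, 3, (8,)))]

for images, labels in loader:
    images = images.to(device)           # inputs must follow the model...
    labels = labels.to(device)           # ...and so must the targets
    optimizer.zero_grad()
    outputs = model(images)
    loss = criterion(outputs, labels)    # all tensors now share one device
    loss.backward()
    optimizer.step()
```

If either .to(device) call is dropped while the model sits on cuda, cross_entropy sees one cpu tensor and one cuda tensor and raises exactly the "Expected all tensors to be on the same device" error from the first traceback.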

In general, it’s not necessary to modify anything outside the “YOUR CODE HERE” sections. You can do that, but you need to be careful that you know what you are doing when you “go there”. :nerd_face:

The code you showed looks correct to me, so the problem must be elsewhere.

And I assume you mean that you see this error when you run the training in the notebook, not when you submit to the grader, right?

hey @zoilorys

The mistake is in the loss calculation: you are supposed to use outputs, but you have used output instead.
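For future readers, here is a hedged sketch of how a typo like this bites (the variable names are illustrative, not the notebook's actual code): a stale output left over from an earlier cell or iteration gets passed to the loss instead of the freshly computed outputs, and if that stale tensor lives on a different device than the labels, you get the cpu/cuda mismatch from the first traceback:

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 3)
criterion = nn.CrossEntropyLoss()

# A stale tensor left over from an earlier cell; in the original post it
# also lived on a different device than the labels, hence the mismatch.
output = model(torch.randn(8, 4))

for images, labels in [(torch.randn(8, 4), torch.randint(0, 3, (8,)))]:
    outputs = model(images)               # fresh forward pass
    # loss = criterion(output, labels)    # buggy: stale tensor `output`
    loss = criterion(outputs, labels)     # fixed: use `outputs`
    loss.backward()
```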

Regards

DP


Thanks, Deepti! I thought I looked at that code, but obviously not closely enough. :weary_cat:


@paulinpaloalto, I double-checked the device handling and it was okay, but I didn't catch the typo. @Deepti_Prasad, thank you so much!!!
Too bad notebooks don't have a proper LSP in them :sweat_smile:


@zoilorys

Your error log did give me a hint, since it kept pointing at t_outputs (kwargs, args).

So at first I thought there might be a mistake in passing the input at the model-call step, but the next line clearly showed that the freshly computed outputs wasn't being used to calculate the loss.
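To connect the dots on the second traceback: reusing a tensor from an earlier forward pass is exactly what produces the "backward through the graph a second time" error, regardless of optimizer.zero_grad(). A minimal sketch (not the notebook's code):

```python
import torch

w = torch.ones(2, requires_grad=True)
output = (w * 2).sum()        # graph built once, outside the loop

for step in range(2):
    loss = output             # stale tensor reused every iteration
    try:
        loss.backward()       # second iteration: graph already freed
    except RuntimeError as e:
        print(f"step {step}: {e}")
```

The first backward() frees the graph's saved intermediate values; the second pass over the same graph then raises the RuntimeError, which is why PyTorch suggests retain_graph=True for the (rare) cases where a second backward is actually intended.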

We have all been there; sometimes even I have wasted days, even a week, only to find a silly mistake later.

Good luck

DP

Thank you @paulinpaloalto