Pix2Pix Assignment issues

Hi everyone,

I’ve encountered two problems while working on this assignment.

The first one is the Dead Kernel issue, as described here: Dead Kernel warning in Pix2Pix Assignment - #3 by 28utkarsh. I fixed it the same way, by setting pretrain to false.

The second one is: RuntimeError: expected device cpu but got device cuda:0

I get this error when I try to train the Pix2Pix model. If I change the device to cpu, the error disappears, but the training seems to freeze from the beginning.

What should I do?

Follow-up: I still submitted the code, and it worked.

Hello @Barb,

It’s great that you were able to tackle your first problem. Concerning your second issue, when you switch to CPU, the training takes a long time because:

  • GANs are computationally expensive to train
  • CPUs are much slower than GPUs for this kind of workload

Perhaps it is not freezing as you claim, but rather training at a very slow rate.
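One way to tell a slow run from a frozen one is to print the time taken by each training step. The snippet below is only a minimal sketch: the tiny convolutional model, the random tensors and the L1 loss are hypothetical stand-ins, not the assignment's actual U-Net, data or loss.

```python
import time

import torch
from torch import nn

# Hypothetical stand-in network; the real assignment model is much larger.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(8, 3, kernel_size=3, padding=1),
)
opt = torch.optim.Adam(model.parameters(), lr=2e-4)
criterion = nn.L1Loss()

step_times = []
for step in range(3):
    # Random tensors stand in for one batch of (input, target) images.
    x = torch.randn(1, 3, 64, 64)
    y = torch.randn(1, 3, 64, 64)

    start = time.perf_counter()
    opt.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    opt.step()
    step_times.append(time.perf_counter() - start)

    # flush=True makes the line show up immediately in a notebook cell.
    print(f"step {step}: loss={loss.item():.4f}, {step_times[-1]:.3f}s", flush=True)
```

If a line like this keeps appearing for every step, the training is running, just slowly.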

Since the assignment you’re referring to doesn’t need you to submit any files, your assignment was accepted as long as the code was valid (logically, syntactically, etc.).

Hope it helps :grinning_face_with_smiling_eyes:

I had a sort of red broken-chain icon next to the cell, which is why I assumed it had frozen.

Thanks for the update

Hi @Barb

Firstly, thanks for raising this question; it gave me an idea for how to solve this issue.

Just add the parameter map_location=device when you load the pre-trained checkpoint. Also, make sure that you have set the device to cuda in the training preparation step. Check Line#16 for setting the map_location parameter.
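As a rough illustration of the map_location suggestion, here is a small self-contained sketch. The nn.Linear model, the in-memory buffer and the "gen" key are assumptions made for the example; the assignment loads its own checkpoint file into its own generator.

```python
import io

import torch
from torch import nn

# Use the GPU when one is available, otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

# Hypothetical stand-in for the generator.
gen = nn.Linear(4, 2)

# Save a checkpoint to an in-memory buffer (stand-in for the checkpoint file).
buffer = io.BytesIO()
torch.save({"gen": gen.state_dict()}, buffer)
buffer.seek(0)

# map_location remaps every stored tensor onto `device` while loading,
# so the checkpoint loads no matter which device it was saved from.
loaded_state = torch.load(buffer, map_location=device)
gen.load_state_dict(loaded_state["gen"])

# The model itself still has to be moved to the same device.
gen = gen.to(device)
```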

Secondly, I agree with @shreyasvedpathak that training a model on the CPU is slow.

Let us know if you face any other problem. :slight_smile:

Hi @28utkarsh

I just set the device back to cuda, and the second error still appears.

Here is a screenshot of the error

Hi @Barb

While calculating the adversarial loss, the tensor of ones is created on the CPU by the statement torch.ones(pred.shape). You need to move it to the same device as the model before using it in the loss.

So, replace the adversarial loss calculation with the following statement:

adv_loss = adv_criterion(pred, torch.ones(pred.shape).to(device))

It’s working now. I also had to call .to(device) on the prediction.
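For anyone hitting the same error, the fix discussed in this thread can be sketched roughly as follows. The random pred tensor is a stand-in for the discriminator output, and using nn.BCEWithLogitsLoss as adv_criterion is an assumption for the example. torch.ones_like is shown as an alternative, since it creates the target on the same device as pred.

```python
import torch
from torch import nn

device = "cuda" if torch.cuda.is_available() else "cpu"

# Assumed criterion for this sketch; use whatever the assignment defines.
adv_criterion = nn.BCEWithLogitsLoss()

# Stand-in for the discriminator's prediction, moved to the training device.
pred = torch.randn(2, 1, 8, 8).to(device)

# torch.ones(pred.shape) alone is allocated on the CPU, so it has to be
# moved to the same device as pred before the loss is computed.
target = torch.ones(pred.shape).to(device)
adv_loss = adv_criterion(pred, target)

# Alternative: ones_like inherits the device and dtype of pred.
adv_loss_alt = adv_criterion(pred, torch.ones_like(pred))
```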

Congratulations, @Barb, on completing your assignment successfully.