CycleGAN UNQ_C5 validation error

Hi all,

I’m doing the CycleGAN assignment, and UNQ_C5 doesn’t validate.

The gen_loss I am supposed to get is 4047804560, but I get 4047804416. I cannot figure out where the missing 144 went.

I do not think I messed up the gen_loss computation, as it should be a simple sum weighted by the different lambdas:

gen_loss = adv_loss_AB + adv_loss_BA + lambda_identity * id_loss_AB + lambda_identity * id_loss_BA + lambda_cycle * cycle_loss_AB + lambda_cycle * cycle_loss_BA

This is what I did for the computation, just in case you want to check it out.

I still tried training the CycleGAN and it works, which doesn’t surprise me, since the loss is only slightly off from what it’s supposed to be.

If you have any insight, I would be grateful.

Best regards

That is indeed incredibly close (0.000004% error), but the inputs are not random, so the output should be exact.

I believe your total loss line is correct, but did you check the other losses? You might have mixed up the order of the get_*_loss helpers’ inputs and/or outputs. Please compare their definitions with the way you are using them.
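
For example, inside get_gen_loss the helpers are usually wired up roughly like this (a sketch from memory of the assignment; double-check the exact helper names and argument order against your own notebook):

    # rough sketch -- helper names and argument order are assumptions, verify in your notebook
    adv_loss_AB, fake_B = get_gen_adversarial_loss(real_A, disc_B, gen_AB, adv_criterion)
    adv_loss_BA, fake_A = get_gen_adversarial_loss(real_B, disc_A, gen_BA, adv_criterion)
    id_loss_AB, identity_B = get_identity_loss(real_B, gen_AB, identity_criterion)
    id_loss_BA, identity_A = get_identity_loss(real_A, gen_BA, identity_criterion)
    cycle_loss_AB, cycle_B = get_cycle_consistency_loss(real_B, fake_A, gen_AB, cycle_criterion)
    cycle_loss_BA, cycle_A = get_cycle_consistency_loss(real_A, fake_B, gen_BA, cycle_criterion)

Swapping real_A/real_B or disc_A/disc_B in just one of these calls shifts exactly one intermediate loss while the rest still look plausible.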

If it helps, here is what you should expect for the “intermediate” losses:

[screenshot: expected intermediate loss values]
[EDIT: first screenshot had cycle_loss_AB wrong]

Good luck! Let us know if you catch the problem.


Hi @pedrorohde,

Thanks for these numbers; I did need them. It is cycle_loss_BA that’s wrong for me.

cycle_loss_AB is fine though.

cycle_loss_AB = 38803686, cycle_loss_BA = 35603382.

Here is the code for the cycle_loss computation.

    #### START CODE HERE ####
    # reconstruct X by mapping the generated fake_Y back through the opposite generator
    cycle_X = gen_YX(fake_Y)
    # compare the reconstruction with the original real image
    cycle_loss = cycle_criterion(real_X, cycle_X)
    #### END CODE HERE ####

Oops, I’m terribly sorry, my screenshot was wrong. It should be:

cycle_loss_BA: tensor(38803686)
cycle_loss_AB: tensor(35603382)

So every number I have is correct apparently.

Is my gen_loss sum wrong?

Your gen_loss works for me. Did you maybe accidentally change the Unit Test?

# UNIT TEST
test_real_A = torch.tensor(97)
test_real_B = torch.tensor(89)
test_gen_AB = lambda x: x * 83
test_gen_BA = lambda x: x * 79
test_disc_A = lambda x: x * 47
test_disc_B = lambda x: x * 43
test_adv_criterion = lambda x, y: x * 73 + y * 71
test_recon_criterion = lambda x, y: (x + y) * 61
test_lambda_identity = 59
test_lambda_cycle = 53
test_res = get_gen_loss(
    test_real_A, 
    test_real_B, 
    test_gen_AB, 
    test_gen_BA, 
    test_disc_A,
    test_disc_B,
    test_adv_criterion, 
    test_recon_criterion, 
    test_recon_criterion, 
    test_lambda_identity, 
    test_lambda_cycle)
assert test_res[0].item() == 4047804560
assert test_res[1].item() == 7031
assert test_res[2].item() == 8051
print("Success!")

No, nothing changed. I did the sum by hand, and I get the right result.

So far, it seems Python doesn’t want to sum these tensors the way they should be summed.

Edit: the unit test is fine too.

Final update:

Everything works now, even though I didn’t change the gen_loss code itself.

I had a device issue in get_gen_adversarial_loss because of the device being passed: I ended up with a float tensor on the CUDA device.
I recast everything to integers and now the full sum comes out right.

My assumption is that something got mixed up when the types were cast.
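
For what it’s worth, that would explain the missing 144: float32 can only represent every integer exactly up to 2^24 (16,777,216), and near 4 × 10^9 the representable values are 256 apart, so a sum accumulated in float32 on the GPU gets rounded to a nearby multiple of 256. A minimal illustration:

    import torch

    exact = 4_047_804_560                        # the expected result
    as_float32 = torch.tensor(exact, dtype=torch.float32)
    print(as_float32.to(torch.int64).item())     # a nearby multiple of 256, not the exact value
    print(torch.tensor(exact).item())            # int64 keeps it exact: 4047804560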

Thank you @pedrorohde for the numbers; they let me debug this.

Final edit (hopefully): the int conversion only gets the unit test to pass; it will mess things up when actually training the CycleGAN.
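
A cleaner fix is probably to let the target tensors inherit their dtype and device from the discriminator output instead of forcing a float CUDA tensor, for example (just a sketch of the idea, not the exact assignment code):

    import torch

    # sketch only: torch.ones_like makes the target match disc_fake_hat's dtype/device,
    # so the unit test stays in exact int64 while real training stays in float32 on the GPU
    def gen_adversarial_loss_sketch(real_X, disc_Y, gen_XY, adv_criterion):
        fake_Y = gen_XY(real_X)
        disc_fake_hat = disc_Y(fake_Y)
        adv_loss = adv_criterion(disc_fake_hat, torch.ones_like(disc_fake_hat))
        return adv_loss, fake_Y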
