crit_real_pred and crit_fake_pred are the critic's scores for the batch of real and fake images, respectively. When calculating crit_loss, the mean should be taken over just these two arguments. The gradient penalty term, gp * c_lambda, is then added to that mean.
If I write "crit_fake_pred - crit_real_pred" instead of "crit_real_pred - crit_fake_pred", the unit test still passes. Another discussion says that the correct order is "crit_real_pred - crit_fake_pred". I'm confused, could you explain what this ordering means, please?
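To make sure we are talking about the same expressions, this is roughly what the two orderings look like in my notebook (just a sketch with made-up values, since the real inputs come from the critic and the gradient penalty function):

```python
import torch

# Made-up stand-ins for the real inputs, just to show the two orderings
crit_real_pred = torch.tensor([2.0, 3.0, 2.5, 1.5])    # critic scores on real images
crit_fake_pred = torch.tensor([-1.0, 0.0, -0.5, 0.5])  # critic scores on fake images
gp, c_lambda = torch.tensor(0.2), 10                    # gradient penalty and its weight

# Order that passes the unit test for me:
loss_v1 = torch.mean(crit_fake_pred) - torch.mean(crit_real_pred) + c_lambda * gp

# Order the other discussion mentions:
loss_v2 = torch.mean(crit_real_pred) - torch.mean(crit_fake_pred) + c_lambda * gp
```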
My understanding of the W-Loss is that the critic is given real and fake images to score, which forms two distributions: one of scores for real images, and the other of scores for fake images. The further apart these two distributions are, the better the critic is at telling the two categories apart. So I am not sure what you meant by the order of real_pred and fake_pred.
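To make the two-distributions picture concrete, here is a tiny sketch with made-up scores: the bigger the gap between the mean real score and the mean fake score, the better the critic is separating the two categories.

```python
import torch

# Made-up critic scores: one distribution for real images, one for fakes
real_scores_weak = torch.tensor([0.6, 0.4, 0.5, 0.7])    # weak critic: distributions overlap
fake_scores_weak = torch.tensor([0.3, 0.5, 0.4, 0.6])

real_scores_strong = torch.tensor([4.0, 5.0, 4.5, 5.5])  # strong critic: distributions far apart
fake_scores_strong = torch.tensor([-4.0, -5.0, -4.5, -3.5])

print(real_scores_weak.mean() - fake_scores_weak.mean())      # small gap (~0.1)
print(real_scores_strong.mean() - fake_scores_strong.mean())  # large gap (~9.0)
```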
@MarcosMM, to follow up a little more on the code implementation part of your question:
As @Kic explains, the main concept is that the critic wants to maximize the difference between its predictions for real and fake. In other words, when you’re calculating the critic’s loss, the farther apart the real and fake predictions are, the smaller the loss should be.
So, if we know the critic wants to maximize crit_real_pred - crit_fake_pred, what does that mean our implementation for the critic’s loss should look like?
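To make that concrete, here is a minimal sketch of what such a loss could look like in PyTorch, assuming the standard WGAN-GP form (the function name here is just for illustration and may not match the assignment's starter code):

```python
import torch

def crit_loss_sketch(crit_fake_pred, crit_real_pred, gp, c_lambda):
    """Sketch of a WGAN-GP critic loss: the critic wants to *maximize*
    mean(real) - mean(fake), so the loss it minimizes is the negative of
    that gap, plus the weighted gradient penalty."""
    return torch.mean(crit_fake_pred) - torch.mean(crit_real_pred) + c_lambda * gp
```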
Another thing to keep in mind is that the larger the critic's prediction, the more confident the critic is that an image is real. I think these optional hints in the assignment do a good job of using that concept to give some intuition about what we should be subtracting and what we should be adding:
The higher the mean fake score, the higher the critic’s loss is.
What does this suggest about the mean real score?
Another thing I like to do when I'm trying to sanity-check an approach is to try out a couple of simple examples and make sure they behave sensibly. For example, if real_pred = 5 and fake_pred = 1, versus real_pred = 20 and fake_pred = 1, does my approach for calculating the critic's loss give a smaller loss for the real_pred = 20 case, as it should?
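For what it's worth, here is that check written out, reusing the sketch from above (single-element tensors and a zero gradient penalty just to keep the arithmetic obvious):

```python
import torch

def loss_sketch(real_pred, fake_pred, gp=0.0, c_lambda=10):
    # Same form as the sketch above: mean(fake) - mean(real) + weighted penalty
    return torch.mean(fake_pred) - torch.mean(real_pred) + c_lambda * gp

print(loss_sketch(torch.tensor([5.0]),  torch.tensor([1.0])))  # -4.0
print(loss_sketch(torch.tensor([20.0]), torch.tensor([1.0])))  # -19.0 (smaller, as expected)
```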