When maximizing the critic score, wouldn't that tend to produce a larger violation (unless lambda is negative)?
Actually, the assignment hints that a higher gradient penalty means a higher critic loss. I assume it means:
crit_loss = -crit_score = torch.mean(crit_fake_pred) - torch.mean(crit_real_pred) + c_lambda * gp
If this is true, then the sign of the gradient penalty term in the lecture should be negative, right?
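To make the sign convention I mean concrete, here's a minimal runnable sketch (variable names follow the assignment; the prediction tensors and gp value are toy stand-ins I made up):

```python
import torch

# Toy stand-ins for the critic's predictions and the penalty value:
crit_real_pred = torch.randn(64, 1)
crit_fake_pred = torch.randn(64, 1)
gp = torch.tensor(0.5)
c_lambda = 10.0

# Critic *score* as written in the lecture (to be maximized):
crit_score = torch.mean(crit_real_pred) - torch.mean(crit_fake_pred) - c_lambda * gp

# Critic *loss* as in the assignment (to be minimized) -- here the
# penalty enters with a positive sign, which is why the lecture's
# formula would need a negative lambda to match:
crit_loss = torch.mean(crit_fake_pred) - torch.mean(crit_real_pred) + c_lambda * gp

assert torch.isclose(crit_loss, -crit_score)
```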
Also, in the assignment, the generator loss is
gen_loss = -torch.mean(crit_fake_pred)
which doesn't rely on the gradient penalty. I think this is because, when training the generator, all of the critic's parameters are fixed, so the gradient penalty on the critic's parameters is fixed at 0, right?
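For reference, here is a minimal sketch of a generator update under this reading (the gen and crit modules are hypothetical stand-ins, not the assignment's exact scaffold):

```python
import torch

# Hypothetical generator/critic modules standing in for the assignment's:
gen = torch.nn.Linear(64, 784)
crit = torch.nn.Linear(784, 1)
gen_opt = torch.optim.Adam(gen.parameters(), lr=2e-4)

noise = torch.randn(128, 64)
fake = gen(noise)
crit_fake_pred = crit(fake)

# Generator loss: only the critic's score on fakes, no gradient penalty.
gen_loss = -torch.mean(crit_fake_pred)

gen_opt.zero_grad()
gen_loss.backward()
gen_opt.step()  # only gen's parameters are updated; crit is untouched
```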
@Jack_Changfan, you are exactly right - lambda would need to be negative in the equation shown in the lecture.
The thinking at the time they put the course together was that the top priority for the lectures was to get the concepts across and leave the implementation details to the assignments. In this case, they wanted to present the basic concept that "With the gradient penalty, all you need to do is add a regularization term to your loss function." Since lambda is a variable, it could theoretically be negative, and spending lecture time on the sign of the gradient penalty term might have muddled the concept.
But shouldn't it rely on the gradient penalty term? The critic is passed x_hat, which is a linear interpolation of fake and real samples, and while training the generator the fake samples vary, so the critic's x_hat varies as well.
I'm not sure if my understanding is right. It would be great if someone could help me understand why the gradient penalty term should not be part of the generator loss.
@Akanksha_Paul, remember from the videos that the 1-L continuity condition that we’re trying to address with the gradient penalty is a condition on the critic only.
As for why this is the case, for me the easiest way to think about it is to remember that the critic needs to consider both its predictions for fake images and its predictions for real images, with the goal of pushing these two distributions farther apart from each other. This is exactly the situation where we need the extra condition to encourage 1-L continuity.
The generator, on the other hand, really only needs to consider the critic’s predictions on its fake images. Its goal is to fool the critic with its fake images - the higher the value, the more real the critic thinks the generator’s image is, which is exactly what the generator wants.
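If it helps to see it in code, here is a minimal sketch of the usual WGAN-GP penalty computation (my own illustration, not the assignment's exact code). Note that the gradient is taken with respect to the interpolated images x_hat, not the critic's parameters, and the resulting penalty is added only to the critic's loss:

```python
import torch

crit = torch.nn.Linear(784, 1)  # stand-in critic

real = torch.randn(128, 784)
fake = torch.randn(128, 784)

# x_hat: random linear interpolation between real and fake images
epsilon = torch.rand(128, 1)
x_hat = epsilon * real + (1 - epsilon) * fake
x_hat.requires_grad_(True)

mixed_scores = crit(x_hat)

# Gradient of the critic's scores w.r.t. the *images*, not the parameters
gradient = torch.autograd.grad(
    outputs=mixed_scores,
    inputs=x_hat,
    grad_outputs=torch.ones_like(mixed_scores),
    create_graph=True,
)[0]

# Penalize deviation of each image's gradient norm from 1 (the 1-L condition)
gp = torch.mean((gradient.norm(2, dim=1) - 1) ** 2)
```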
If you want to go deeper into exactly why this works the way it does, there’s a link to the official Wasserstein GAN paper here: Build Basic GANS Week 3 Works Cited
I’m confused about the computation of the Wasserstein critic’s loss with regard to the prediction of Real examples.
With BCE, we measure how far the prediction of Real is from 1, which is the label for Real.
But for Wasserstein, there is no upper limit for the prediction and no label to measure against. If there is no upper limit, how can we measure how "wrong" the prediction of a Real example is? Is a prediction of 10 "worse" than a prediction of 20 (for a Real)?
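To make my question concrete, here is a small sketch of the two losses as I understand them (my own code, not from the assignment):

```python
import torch

crit_real_pred = torch.tensor([10.0, 20.0])

# BCE: each prediction is scored against the label 1 for Real
bce = torch.nn.functional.binary_cross_entropy_with_logits(
    crit_real_pred, torch.ones_like(crit_real_pred)
)

# Wasserstein: no label and no upper limit -- the real predictions only
# enter the loss through their mean, relative to the fake predictions
crit_fake_pred = torch.tensor([1.0, 2.0])
crit_loss = torch.mean(crit_fake_pred) - torch.mean(crit_real_pred)
```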
I’d like to point out that how to calculate c(x) was not explained in the videos or in the slides, hence my question.