Why do we have to multiply by -1 and take the mean of crit_fake_pred?
Hi @starboy,
To recap, the critic is the equivalent of a discriminator; the difference is that the critic tries to maximize the distance between its evaluation of a fake and its evaluation of a real, so its output is no longer between 0 and 1 but can be any real number.
For the generator, the loss is calculated by maximizing the critic’s prediction on the generator’s fake images.
The W-loss can be represented by this formula:
$$\min_g \max_c \; \mathbb{E}\big[c(x)\big] - \mathbb{E}\big[c(g(z))\big]$$
You can refer to the first minute of the video Condition on Wasserstein for more information.
Hope it helps!
Fangyi Yu
You can think about it this way:
We want to maximize the critic’s scores on the generator’s fake images. Usually, what we do is minimize a loss function. But if you think about it, maximizing a value is equivalent to minimizing its negative.
Taking the mean, as @fangyiyu said, is just to average over all examples in the batch, so as to have a single scaled value for the loss (if we did a sum, it would have different scales for different batch sizes).
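Putting both points together, here is a minimal PyTorch sketch of the generator loss (the name get_gen_loss is just illustrative here):

```python
import torch

def get_gen_loss(crit_fake_pred):
    # We want to *maximize* the critic's scores on the fakes, so we
    # *minimize* their negative; the mean averages over the batch so
    # the loss scale does not depend on the batch size.
    return -torch.mean(crit_fake_pred)
```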
I would just add something to the answers given by @fangyiyu and @pedrorohde. The reason we are multiplying by -1 is that, in deep learning frameworks like PyTorch, optimization works by minimizing the objective function. As such, we place the minus sign in front!
As mentioned by @sohonjit.ghosh, in computer science, when we want to convert a maximization problem into a minimization problem (as with the knapsack problem), we multiply the loss/cost function by a minus sign. Your question is one such case.
Hey @pedrorohde @sohonjit.ghosh,
I have a small doubt regarding this. Just after this function, we are told to calculate the loss for the critic, and it is written that
For the critic, the loss is calculated by maximizing the distance between the critic’s predictions on the real images and the predictions on the fake images while also adding a gradient penalty. The gradient penalty is weighed according to lambda.
In this case as well, we need to maximize the distance, so why aren’t we minimizing the negative of the expression here too? In other words, why are we using
crit_loss = torch.mean(crit_fake_pred - crit_real_pred + c_lambda * gp)
and not
crit_loss = -1 * torch.mean(crit_fake_pred - crit_real_pred + c_lambda * gp)
I have submitted my notebook, and it is showing me an out-of score. Have I done anything wrong?
Hi @Elemento
Maybe the assignment’s phrasing is a little confusing, because the “distance” between the two predictions could go either way: it could be either crit_fake_pred - crit_real_pred or crit_real_pred - crit_fake_pred.
If you think about it, this is the critic’s loss. We want the critic to score fake images low and real images high (remember, its output isn’t bounded between 0 and 1). That means we want crit_fake_pred to be very small (minimize it), and crit_real_pred to go up (maximize it).
Putting it in terms of optimization, this leads our minimization loss function to be crit_fake_pred - crit_real_pred, which amounts exactly to minimizing crit_fake_pred and maximizing crit_real_pred.
As for the gradient penalty (c_lambda * gp), it should be minimized: we want the gradient norm to be as close to 1 as possible, and our penalty function measures the distance from the norm to 1. Minimizing it makes the gradient norm get closer to 1.
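To make the direction of each term concrete, here is a minimal PyTorch sketch; the gradient_penalty helper below is an illustrative assumption about how gp is computed, not the assignment’s exact code:

```python
import torch

def get_crit_loss(crit_fake_pred, crit_real_pred, gp, c_lambda):
    # Minimizing this pushes crit_fake_pred down, crit_real_pred up,
    # and the weighted gradient penalty toward zero.
    return torch.mean(crit_fake_pred - crit_real_pred + c_lambda * gp)

def gradient_penalty(critic, real, fake, device="cpu"):
    # Illustrative: score random interpolations of real and fake images,
    # then penalize the critic's gradient norm for straying from 1.
    eps = torch.rand(real.size(0), 1, 1, 1, device=device)
    mixed = (eps * real + (1 - eps) * fake).requires_grad_(True)
    mixed_scores = critic(mixed)
    gradient = torch.autograd.grad(
        outputs=mixed_scores,
        inputs=mixed,
        grad_outputs=torch.ones_like(mixed_scores),
        create_graph=True,
    )[0]
    gradient_norm = gradient.view(gradient.size(0), -1).norm(2, dim=1)
    return torch.mean((gradient_norm - 1) ** 2)
```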
Hope that made sense for you. Cheers
Thanks a lot @pedrorohde, it makes complete sense to me now!