Questions about paper from GAN to WGAN by Lilian Weng

Dennis_Sinitsky · March 24, 2024, 7:08pm

Hello,
I am reading the paper “From GAN to WGAN” and want to ask questions about it in this thread.
After posting Q1, I will post Q2, Q3, Q4, etc. in this thread as well, as those questions come up.
Q1. This is about the table on page 3.

What is “data x”? Where is it in a GAN? Is it the output of generator G? Also, the table says “over data x” for p_g and “over real sample x” for p_r. How is this consistent?
Thank you

paulinpaloalto · March 24, 2024, 7:22pm

I have not read that paper, so I’m just guessing, but I’d say that p_g is the distribution of the output of the generator. That’s my guess as to what they mean by x in that context.

Note that this is consistent with how they describe p_r: the distribution of the real samples x.

The inputs to the discriminator are x and they either come from real samples or generated samples.
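
A toy sketch of that point, in case it helps. Everything here is made up for illustration and is not from the paper: the 1-D data, the helper names `sample_real`, `generator`, `sample_fake` and `discriminator`, and the hand-written formulas standing in for trained networks.

```python
import math
import random

def sample_real():
    # Hypothetical stand-in for drawing a real sample x ~ p_r
    # (here: 1-D values clustered around 4.0).
    return random.gauss(4.0, 0.5)

def generator(z):
    # Hypothetical toy generator G: maps noise z to a sample x = G(z).
    return 2.0 * z + 1.0

def sample_fake():
    # Draw noise z ~ p_z, then push it through G; the result follows p_g.
    z = random.gauss(0.0, 1.0)
    return generator(z)

def discriminator(x):
    # Hypothetical toy discriminator D: returns a value in (0, 1) that is
    # higher when x is close to where the real data lives.
    return 1.0 / (1.0 + math.exp(abs(x - 4.0) - 1.0))

# The discriminator receives x either way; only the source of x differs.
x_real = sample_real()   # x ~ p_r
x_fake = sample_fake()   # x ~ p_g
print(discriminator(x_real), discriminator(x_fake))
```

So "x" is just the name of whatever lands at the discriminator's input; p_r and p_g are two different distributions over that same space.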

Dennis_Sinitsky · March 24, 2024, 10:35pm

So x is the input to the discriminator, whether it is the output of the generator or a real image…

Dennis_Sinitsky · March 27, 2024, 4:06am

Why does the definition of the Wasserstein distance use inf, not min? I hope someone can explain.
Thanks!
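
For reference, the definition in question, written from memory of the paper (so treat the exact notation as an assumption), is:

W(p_r, p_g) = \inf_{\gamma \in \Pi(p_r, p_g)} \mathbb{E}_{(x, y) \sim \gamma} \left[ \| x - y \| \right]

Here \Pi(p_r, p_g) is the set of all joint distributions \gamma(x, y) whose marginals are p_r and p_g, and the inf is taken over that set.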

paulinpaloalto · March 27, 2024, 4:52am

This is a “pure math” thing. Unless you have gotten to about junior year as a math major and taken a course in Real Analysis (what comes after calculus and multivariate calculus), this is “over the top”. Do you know the difference between an “open” and a “closed” set?

The point is that in an open set (think of an open interval), the greatest lower bound is the limit of values in the set, but it’s not actually an element of the set, so the set has no minimum. That’s why they use the term infimum: to handle that distinction.

Don’t worry about it.

Deepti_Prasad · March 27, 2024, 9:12am

Data x is the real examples used to train the discriminator D, which estimates the probability that a given sample comes from the real dataset.

I am not sure what you are asking here?

A generator G outputs synthetic samples given a noise variable input z (z brings in potential output diversity). It is trained to capture the real data distribution so that its generative samples can be as real as possible, or in other words, can trick the discriminator to offer a high probability.

So that means the real data x is not the output of generator G, and the generator’s input is the noise z. The generator’s output G(z) is what p_g describes.

p_g ==> the generator’s distribution over data x. It is trained to capture the real data distribution so that the generated samples are as real as possible, or in other words, can trick the discriminator into offering a high probability.

p_r ==> data distribution over real sample x ==> this is the distribution of the real samples that the discriminator sees.

So basically, noise z drawn from p_z is fed to the generator, whose fake samples follow p_g. Training pushes p_g toward the real data distribution p_r, so the generator gets better at producing realistic-looking fake samples, while the discriminator keeps challenging the generator by getting better at telling those fake samples apart from the real ones.

Basically, the discriminator here is the quality checker, and the generator is trying to create higher-quality samples to fool the discriminator, thereby also improving the discriminator at catching issues with the generator’s fake samples.

Deepti_Prasad · March 27, 2024, 9:31am

The main reason the Wasserstein formula uses the infimum is that W-loss measures how far the critic’s prediction on the real samples is from its prediction on the fake samples, whereas BCE loss measures the distance between a prediction and a ground truth of 1 or 0. So the discriminator is bounded between 0 and 1 with BCE loss, but W-loss approximates the Earth Mover’s Distance between the real and generated distributions; hence the use of the greatest lower bound in calculating the smallest cost is justified.
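
A small sketch of the BCE vs W-loss contrast described above. The helper names `bce_loss` and `critic_w_loss` and all the numbers are made up for illustration; the sign convention (the critic minimizes mean fake score minus mean real score) is a common one and is an assumption here, not something taken from the paper. Note that this only illustrates the two losses, not the infimum itself.

```python
import math

def bce_loss(probs, labels):
    # Binary cross-entropy: compares each probability in (0, 1)
    # against a ground-truth label of 1 (real) or 0 (fake).
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for p, y in zip(probs, labels):
        total += y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return -total / len(probs)

def critic_w_loss(real_scores, fake_scores):
    # W-loss for the critic (assumed convention): mean score on fakes
    # minus mean score on reals. Scores are unbounded real numbers.
    mean_fake = sum(fake_scores) / len(fake_scores)
    mean_real = sum(real_scores) / len(real_scores)
    return mean_fake - mean_real

# Discriminator outputs are probabilities, compared against labels.
bce = bce_loss([0.9, 0.2], [1, 0])

# Critic outputs are arbitrary scores, compared against each other.
w = critic_w_loss([3.0, 2.5, 4.0], [-1.0, 0.5, -2.0])

print(bce, w)
```

The critic’s scores are compared with each other rather than with fixed labels, which is why they do not need to be squashed into the range 0 to 1.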

There are two videos in the same week that address your doubt. Kindly go through “Wasserstein Loss” and “Condition on Wasserstein Critic”. That should explain the reason for choosing the infimum and not the minimum.

Regards
DP

paulinpaloalto · March 27, 2024, 3:13pm

To give a concrete example of the point here, consider the following set:

\{x \in \mathbb{R}: 0 < x < 1\}

What is the minimum value in that set? There is none, right? If you pick any element of that set that is “close” to 0, there are an uncountably infinite number of other values in that set that are closer to 0. You want to say 0, but 0 is not an element of that set.

So we need to call 0 something different than the minimum of the set: it is the “greatest lower bound” or “infimum” of that set. They are basically inventing a new word (but derived from Latin, of course) to represent that concept. By analogy, we would say that 1 is the “supremum” of that set (the least upper bound of the set).
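
If you want to poke at this numerically, here is a minimal sketch. The helper `in_open_interval` and the particular candidates 1/2, 1/4, 1/8, ... are just my choices for illustration; a finite list can only hint at the idea, it does not prove anything.

```python
from fractions import Fraction

def in_open_interval(x):
    # Membership test for the set {x in R : 0 < x < 1}.
    return 0 < x < 1

# Exact rational candidates that get closer and closer to 0.
candidates = [Fraction(1, 2**k) for k in range(1, 11)]

for x in candidates:
    smaller = x / 2
    # Whatever element we pick, half of it is also in the set and is smaller,
    # so no element of the set can be the minimum.
    assert in_open_interval(x)
    assert in_open_interval(smaller) and smaller < x

closest = min(candidates)        # smallest element of this finite list
print(closest)                   # 1/1024
print(in_open_interval(0))       # False: the infimum 0 is not in the set
```

Any finite list of candidates has a minimum, but the full set does not: the values approach 0 without ever reaching it.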

You would certainly be forgiven for thinking that this is not an important or useful distinction, but mathematicians have parties about stuff like this.

Dennis_Sinitsky · March 27, 2024, 9:08pm

Thanks, Paul. I understand now.
