If I understand correctly, the `c_criterion` is the mean cross entropy loss of the approximation over the generator’s images.

But I can’t tie this definition to the implementation in the notebook:

c_criterion = lambda c_true, mean, logvar: Normal(mean, logvar.exp()).log_prob(c_true).mean()

Hi @kopatko.anna99, welcome to the community!

Which notebook are you referring to? I’m not noticing a `c_criterion` in either of the two assignments in Course 1, Week 4 (Conditional GAN and Controllable Generation).

Hey @Wendy, I guess we are talking about InfoGAN, according to the title. @kopatko.anna99 To be honest, I did go through this notebook, and some of these things went over my head, but now that you ask this question, I can at least relate the two things. Here’s what is in my head.

The first image above shows the mathematical formulation of the `mean cross entropy loss of the approximation over the generator's images`. The second image above shows the code for computing the `c_criterion`, and the third image above shows how we use it in our code.
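For reference, here is the variational lower bound as I remember it from the InfoGAN paper. I am assuming this is what the first image shows; the notebook's exact notation may differ:

$$
L_I(G, Q) = \mathbb{E}_{x \sim G(z, c)}\Big[\, \mathbb{E}_{c' \sim P(c \mid x)}\big[ \log Q(c' \mid x) \big] \Big] + H(c)
$$

Here $Q$ is the auxiliary network that approximates the posterior over the latent code, and $H(c)$ is the entropy of the code prior, which is a constant when the prior is fixed.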

Here, in the first image, the **inner term represents the cross entropy loss**, and the **outer expectation represents the mean** that we have to consider. In the code (second image), `.mean()` represents this outer expectation term. Now, we need `c'|x`, where `c'` is from `P(c,x)`, i.e., the probability distribution of labels in accordance with the images generated by the GAN. That’s what the `c_labels` (third image) are, since in our flow of code we decide the labels first and then generate the images accordingly.

The rest of the formulation then fits in pretty well. We feed in `disc_q_mean` and `disc_q_logvar` (third image) to construct a **Normal distribution** over the codes (`Normal` treats each code dimension independently). Since the network outputs the variance in log scale, we take its exponential, `logvar.exp()`. Note that `Normal` takes a scale as its second argument, and the notebook passes `logvar.exp()` there directly. Once we construct the distribution, we use the `log_prob` function to evaluate the log of the PDF at these `c_labels` values. To know more about why we use the `log_prob` function and not simply `log`, refer to this thread, since it stumped me as well. And voila, the code (second image) aligns with the mathematical formulation (first image).
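To make the mapping concrete, here is a minimal, dependency-free sketch of what I believe the notebook's `c_criterion` lambda computes. It is not the notebook's code: `normal_log_prob` is a hypothetical helper that writes out the Gaussian log-density by hand instead of calling `torch.distributions.Normal`, and the toy batch values are made up. Like the notebook, it passes `exp(logvar)` as the scale.

```python
import math


def normal_log_prob(x, mean, scale):
    """Log-density of a univariate Normal(mean, scale) evaluated at x.

    Hypothetical helper: the closed-form Gaussian log-density,
    written out by hand so the sketch needs no external library.
    """
    var = scale ** 2
    return (
        -((x - mean) ** 2) / (2 * var)
        - math.log(scale)
        - 0.5 * math.log(2 * math.pi)
    )


def c_criterion(c_true, mean, logvar):
    """Mean log-likelihood of the true codes under Q's predicted Gaussian.

    Mirrors the notebook's lambda: exp(logvar) is used as the scale,
    and the final average plays the role of the outer expectation.
    """
    total, count = 0.0, 0
    for c_row, m_row, lv_row in zip(c_true, mean, logvar):
        for c, m, lv in zip(c_row, m_row, lv_row):
            total += normal_log_prob(c, m, math.exp(lv))
            count += 1
    return total / count


# Made-up toy batch: 2 generated images, 2 code dimensions each.
c_labels = [[0.5, -1.0], [0.0, 2.0]]    # codes fed to the generator
good_mean = [[0.5, -1.0], [0.0, 2.0]]   # Q recovers the codes exactly
bad_mean = [[2.0, 1.0], [-2.0, 0.0]]    # Q is far off
logvar = [[0.0, 0.0], [0.0, 0.0]]       # exp(0) = 1, so unit scale

good = c_criterion(c_labels, good_mean, logvar)
bad = c_criterion(c_labels, bad_mean, logvar)
print(good, bad)
```

The value is higher when the predicted means sit close to the codes that were actually used, which is why this term behaves like a (lower bound on) mutual information to be maximized rather than a loss to be minimized.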

In case I have left something out or misinterpreted something, I am sure Wendy will take care of that.

Regards,

Elemento