Level of diversity within a class versus bias

Hi everyone,

I would be interested in the comments or thoughts of my classmates and the TAs regarding the following.

It seems to me that StyleGAN is by its nature biased, and that the truncation method may introduce additional bias - though on a small scale - when there is a relatively high level of diversity within one of the classes.
And when this is the case, there is in fact a trade-off between fidelity and bias.
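
To make the truncation part concrete, here is a minimal numpy sketch of the StyleGAN-style truncation trick. The shapes, scales, and the name `w_avg` are my own assumptions for illustration, not taken from the actual implementation, but the idea is the same: latents are pulled toward the mean, so a smaller psi shrinks the spread of what the generator sees.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the mapping network's output: latent codes w
# scattered around a mean latent w_avg (dimensions are made up).
w_avg = np.zeros(512)
w = rng.normal(loc=w_avg, scale=1.0, size=(10_000, 512))

def truncate(w, w_avg, psi):
    """StyleGAN-style truncation: pull each latent toward the mean.

    psi = 1.0 leaves w unchanged; smaller psi trades diversity for fidelity.
    """
    return w_avg + psi * (w - w_avg)

for psi in (1.0, 0.7, 0.5):
    w_trunc = truncate(w, w_avg, psi)
    spread = np.linalg.norm(w_trunc - w_avg, axis=1).mean()
    print(f"psi={psi}: mean distance from w_avg = {spread:.1f}")
```

The distance from the mean scales directly with psi, which is exactly the shrinking of the Pg distribution I am talking about below.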

If we consider the diagram from the "Improved Precision and Recall Metric for Assessing Generative Models" paper from week one, and add some indication of a very diverse class (represented by the yellow x symbols) spread around the Pr distribution, then we can probably infer that this yellow x class may be underrepresented in the generated images, and that this underrepresentation could increase further when truncation is applied and the Pg distribution shrinks.
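
Here is a rough toy version of that picture (everything in it is made up: the 2-D points, the coverage radius, and the `covered_fraction` helper are just stand-ins for the Pr/Pg diagram, not the actual precision/recall metric from the paper). A compact majority class and a widely spread "yellow x" class form Pr, and the generated samples get pulled toward the mean as truncation gets stronger.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy 2-D stand-in for the Pr / Pg picture (all numbers are invented).
majority = rng.normal(0.0, 1.0, size=(900, 2))   # compact majority class
diverse = rng.normal(0.0, 4.0, size=(100, 2))    # high within-class diversity ("yellow x")
fake = rng.normal(0.0, 1.5, size=(1000, 2))      # generator roughly matches Pr but undershoots the tails

def covered_fraction(targets, samples, radius=1.0):
    """Fraction of target points with at least one generated sample within `radius`
    (a crude stand-in for recall/coverage)."""
    d = np.linalg.norm(targets[:, None, :] - samples[None, :, :], axis=-1)
    return (d.min(axis=1) <= radius).mean()

for psi in (1.0, 0.7, 0.5):
    fake_trunc = psi * fake  # truncation toward the mean (which is 0 here)
    print(f"psi={psi}: "
          f"majority coverage = {covered_fraction(majority, fake_trunc):.2f}, "
          f"diverse coverage = {covered_fraction(diverse, fake_trunc):.2f}")
```

In this toy setup the coverage of the spread-out class drops much faster than that of the compact class as psi decreases, which is the extra bias I suspect truncation introduces.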

Regarding real-world issues, maybe we need to recall that there is more genetic diversity within populations of African origin than in the rest of the world's population combined.

Maybe this means that, in order to avoid bias, not only do we need to make sure that the number of data points for each class in the training set reflects the actual world, but we also need to take into account the relative diversity of each class's distribution.
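
If it helps, here is a small sketch of the kind of check I have in mind (the class names, embedding dimensions, and the centroid-distance score are all hypothetical): two classes with the same number of examples can still have very different within-class diversity.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical feature embeddings per class (e.g. from a pretrained encoder);
# names and numbers are invented purely for illustration.
embeddings = {
    "class_A": rng.normal(0.0, 1.0, size=(5000, 128)),  # many samples, low spread
    "class_B": rng.normal(0.0, 3.0, size=(5000, 128)),  # same count, higher spread
}

def diversity_score(x):
    """Mean distance to the class centroid - a crude proxy for within-class diversity."""
    return np.linalg.norm(x - x.mean(axis=0), axis=1).mean()

for name, feats in embeddings.items():
    print(f"{name}: n = {len(feats)}, diversity = {diversity_score(feats):.1f}")
# Equal counts, but class_B would need broader coverage (or more samples)
# to be represented as well as class_A.
```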

See you soon,
David

@David_HAIN, good point that we need to consider not only the number of examples from each class, but also the diversity of those examples! :heart: