Question about C2W2 optional notebook GAN Debiasing

In the aforementioned notebook there is a description of the characteristics function of the synthetic data:


The notation of this function is a little strange to me. Does it mean that the synthetic samples with the target label and those with the protected label are mutually exclusive? If so, it seems to me that the equation would also hold if the two sets of samples fully overlapped. Where is my thinking wrong?

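The goal of this approach is to de-correlate the target and protected labels using GANs. As the example says, we want to de-correlate the presence of a hat from the presence of sunglasses. To this end, we can generate a new dataset in which the two labels are statistically independent, so that seeing one tells you nothing about the other. That is different from both cases you mention: if the two sets were mutually exclusive we would have P(A|B)=0, and if they fully overlapped we would have P(A|B)=1, and in general neither equals P(A). The notation shows that the probability of getting hats is independent of sunglasses: if A does not depend on B, then P(A|B)=P(A).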

Otherwise, the presence of sunglasses may indirectly mean that there is a hat too, which is bad if we want to exclude protected attributes such as gender and race.
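
If it helps, here is a minimal sketch that checks P(A|B)=P(A) on three toy datasets, with A = "has a hat" and B = "has sunglasses". The counts and the helper name `conditional_and_marginal` are made up for illustration and are not taken from the notebook.

```python
from fractions import Fraction


def conditional_and_marginal(samples):
    """Return (P(hat | sunglasses), P(hat)) as exact fractions.

    `samples` is a list of (has_hat, has_sunglasses) boolean pairs.
    Hypothetical helper, not part of the course notebook.
    """
    n = len(samples)
    n_hat = sum(1 for hat, _ in samples if hat)
    n_sun = sum(1 for _, sun in samples if sun)
    n_both = sum(1 for hat, sun in samples if hat and sun)
    return Fraction(n_both, n_sun), Fraction(n_hat, n)


# Toy datasets of 100 images each (assumed counts, for illustration only).
independent = ([(True, True)] * 25 + [(True, False)] * 25
               + [(False, True)] * 25 + [(False, False)] * 25)
exclusive = [(True, False)] * 50 + [(False, True)] * 50
overlapped = [(True, True)] * 50 + [(False, False)] * 50

for name, data in [("independent", independent),
                   ("mutually exclusive", exclusive),
                   ("fully overlapped", overlapped)]:
    cond, marg = conditional_and_marginal(data)
    print(f"{name}: P(A|B)={cond}, P(A)={marg}, equal={cond == marg}")
```

Only the first dataset satisfies the equation. In the mutually exclusive one, sunglasses guarantee there is no hat, and in the fully overlapped one they guarantee there is a hat, so in both cases B still carries information about A.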

Thank you~ My statistics is too rusty. I'd better review it.