Is there any particular reason we go back to BCE loss for the programming exercise on cGANs? I understand mode collapse is no longer an issue, but couldn't vanishing gradients still be a problem? Is this done mainly for speed (since training is slower with Wasserstein loss)?
Hey @utmital,
Welcome to the community! That's a nice question, and I would also like to know the answer. My guess is that it's because we are, in a sense, looking for mode collapse in conditional GANs: we want the generator to produce images of only the single class specified by the class encoding passed to it, and using W-loss would prevent that kind of collapse. But as you said, it could also be to avoid the extra computation, so that training time is reduced.
But let me point out, these are just my thoughts, and they might be incorrect!
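For anyone who wants to see the difference concretely, here is a rough sketch of the two generator-side losses in PyTorch. This is not the actual assignment code; the function and tensor names are just placeholders, and the critic's gradient penalty term is left out for brevity.

```python
import torch
import torch.nn.functional as F

def bce_gen_loss(disc_fake_pred):
    # BCE loss: the generator tries to make the discriminator output 1 ("real")
    # on fake images. This can saturate (vanishing gradients) when the
    # discriminator is very confident the images are fake.
    return F.binary_cross_entropy_with_logits(
        disc_fake_pred, torch.ones_like(disc_fake_pred)
    )

def w_gen_loss(crit_fake_pred):
    # W-loss: the generator tries to maximize the critic's unbounded score,
    # so the gradient doesn't saturate; in practice this is paired with a
    # gradient penalty on the critic to enforce the 1-Lipschitz constraint,
    # which is the extra computation mentioned above.
    return -crit_fake_pred.mean()

# Tiny usage example with dummy discriminator/critic outputs
fake_pred = torch.randn(8, 1, requires_grad=True)
print(bce_gen_loss(fake_pred).item(), w_gen_loss(fake_pred).item())
```

The extra gradient-penalty pass is the main source of the added training cost with W-loss, which is why the speed explanation seems plausible to me as well.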