Questions for GAN to generate MNIST digits


While I completed the programming assignment for Week 1, I have some lingering questions about a few aspects. Your help is greatly appreciated!

  1. Since the discriminator’s output has only one neuron/node, I just want to confirm that the discriminator is doing binary classification (logistic regression) on whether an image is a digit at all, without further classifying it as a “0”, “1”, … “9”. This seems consistent with the example of generating an image of a cat, where the noise vector is the “seed” for what type/breed of cat will be generated.

  2. In the optional part of the programming assignment, I trained the GAN for 90K+ steps and got the following images from the generator. Are they similar to what others have gotten using the default batch size and number of epochs? I am trying to figure out whether these images are within expectations.

  3. Regarding the comment that one wants the generator and the discriminator to always be similar in capability: the top of the image below shows that the generator’s loss ended up at ~2.5x the discriminator’s loss. Is 2.5x considered a sizable imbalance between the two losses? If so, has anyone considered actively monitoring the imbalance during training and, when one arises, dynamically giving extra training to the network with the higher loss until the imbalance is gone? I admit this would make the training non-deterministic.

  4. For training the generator, instead of using fixed values for batch size and number of epochs, has anyone considered simply training until the loss drops below a threshold? This would probably still need to be ORed with a maximum number of epochs, set to a very high value as a safety net.

Not quite. The discriminator’s job is to decide whether the input image is fake or real. It doesn’t need to classify it by which digit it is. That’s why it needs only one output neuron: the sigmoid output is interpreted as “yes” or “no”, meaning “real” or “fake”.
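To make the single-output point concrete, here is a minimal sketch of what one sigmoid output neuron computes. It assumes a flattened image and made-up weights; `discriminator_output` and its arguments are hypothetical names for illustration, and the assignment's actual discriminator is a deeper network that is not reproduced here.

```python
import math


def sigmoid(z):
    """Map a real number to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def discriminator_output(pixels, weights, bias):
    """Single output neuron: one score for "is this image real?".

    pixels and weights are hypothetical flat lists of equal length.
    There is no per-digit output, so nothing here says which digit it is.
    """
    logit = sum(p * w for p, w in zip(pixels, weights)) + bias
    return sigmoid(logit)


# Toy 4-pixel "image" with made-up weights, purely for illustration.
p_real = discriminator_output([0.0, 1.0, 1.0, 0.0], [0.5, -0.25, 0.75, 0.1], 0.0)
label = "real" if p_real >= 0.5 else "fake"
print(label, round(p_real, 4))
```

The only thing that comes out is one number between 0 and 1. Classifying by digit would need ten outputs instead of one.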

Yes, your generated images look pretty similar to the quality I saw when running the training here. You can try increasing the number of epochs and see how they continue to evolve.

For your questions 3) and 4), those are interesting and creative ideas for dynamically adjusting the training process to maintain the balance between the discriminator and the generator. I’m not any kind of expert on GANs and don’t know anything more than what I learned in the courses here, so I’m not aware of the SOTA or whether there has been research into strategies like that. Maybe we’ll get lucky and someone more knowledgeable than me will notice and comment.
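For anyone who wants to experiment, ideas 3) and 4) could be sketched roughly as below. This is only an illustration of the control flow: `train_balanced`, `d_step`, `g_step`, `max_ratio`, `max_extra`, and `loss_threshold` are all hypothetical names, not part of the assignment, and nothing here shows that the approach actually improves results.

```python
def train_balanced(d_step, g_step, max_steps, loss_threshold,
                   max_ratio=2.0, max_extra=5):
    """Hypothetical control loop for ideas 3) and 4).

    d_step and g_step are assumed to be callables that each run one
    optimizer update and return that network's loss as a float.
    Returns (steps_run, last_d_loss, last_g_loss).
    """
    d_loss = g_loss = float("inf")
    for step in range(1, max_steps + 1):
        d_loss = d_step()
        g_loss = g_step()

        # Idea 3: give the network with the higher loss extra updates,
        # capped at max_extra so one side cannot stall the loop forever.
        for _ in range(max_extra):
            if g_loss > max_ratio * d_loss:
                g_loss = g_step()
            elif d_loss > max_ratio * g_loss:
                d_loss = d_step()
            else:
                break

        # Idea 4: stop once the generator loss is under the threshold;
        # max_steps remains the safety net.
        if g_loss < loss_threshold:
            return step, d_loss, g_loss
    return max_steps, d_loss, g_loss


# Toy usage with scripted losses instead of real networks.
fake_g_losses = iter([4.0, 3.0, 2.0, 1.0, 0.5, 0.4])
steps, d_loss, g_loss = train_balanced(
    d_step=lambda: 1.0,
    g_step=lambda: next(fake_g_losses),
    max_steps=10,
    loss_threshold=0.45,
)
print(steps, d_loss, g_loss)
```

In a real run, `d_step` and `g_step` would wrap the existing discriminator and generator updates. The cap on extra updates and the step limit are there because nothing guarantees that either loss will ever cross the chosen ratio or threshold.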