Before submitting images to the grader for the GAN Hand assignment, I first checked what my own discriminator thought of them. I got the following result:
<tf.Tensor: shape=(16, 1), dtype=float32, numpy=
array([
[7.9994911e-01],
[9.9239588e-01],
[9.4112718e-01],
[7.7543765e-01],
[9.4984710e-01],
[7.8973532e-01],
[3.0495387e-01],
[3.8384840e-01],
[5.7092643e-01],
[3.2548344e-01],
[5.2061573e-02],
[1.8261197e-01],
[1.8240607e-04],
[4.3043430e-05],
[9.8573253e-02],
[7.2770426e-04]], dtype=float32)>
I was not anticipating a pass, since I had only 6 images above the required threshold, a few more that were at least within a factor of 2 of it, and then 4 that I just placed at the end more or less at random to get to 16 images. However, mirabile dictu, I passed with these results:
score for image hand0.png: 0.894
score for image hand1.png: 0.595
score for image hand2.png: 0.797
score for image hand3.png: 0.713
score for image hand4.png: 0.375
score for image hand5.png: 0.729
score for image hand6.png: 0.325
score for image hand7.png: 0.428
score for image hand8.png: 0.813
score for image hand9.png: 0.739
score for image hand10.png: 0.302
score for image hand11.png: 0.382
score for image hand12.png: 0.746
score for image hand13.png: 0.506
score for image hand14.png: 0.317
score for image hand15.png: 0.753
Of my "solid 6", only 4 passed. Luckily, 2 from my "closish" group passed as well. But what was truly surprising was that two images that my discriminator rejected with greater than 99% confidence were accepted by the grader.
I suspect this shows how sensitive the discriminator is to the exact series of noisily generated images it trains on. I also suspect that the best way to "pass" is to take advantage of the ability to submit as many times as one likes. Just submit all 31 images (or more if you save "good" images during training), see if there is some set of 8 images that the grader has >60% confidence in, and make a final submission including those images.
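For anyone who wants to automate the "see which ones the grader liked" step, here is a rough Python sketch that parses grader output in the format pasted above and lists the images scoring over the threshold. The 0.6 threshold and the count of 8 are taken from my description above, not from any official spec, and the function names are just ones I made up for illustration.

```python
import re

# Grader output copied verbatim from the results earlier in this post.
GRADER_OUTPUT = """\
score for image hand0.png: 0.894
score for image hand1.png: 0.595
score for image hand2.png: 0.797
score for image hand3.png: 0.713
score for image hand4.png: 0.375
score for image hand5.png: 0.729
score for image hand6.png: 0.325
score for image hand7.png: 0.428
score for image hand8.png: 0.813
score for image hand9.png: 0.739
score for image hand10.png: 0.302
score for image hand11.png: 0.382
score for image hand12.png: 0.746
score for image hand13.png: 0.506
score for image hand14.png: 0.317
score for image hand15.png: 0.753
"""

THRESHOLD = 0.6  # assumption: the ">60% confidence" figure mentioned above
REQUIRED = 8     # assumption: number of passing images needed


def parse_scores(text):
    """Return {filename: score} for every 'score for image X: Y' line."""
    pattern = re.compile(r"score for image (\S+): ([0-9.]+)")
    return {m.group(1): float(m.group(2)) for m in pattern.finditer(text)}


def passing_images(scores, threshold=THRESHOLD):
    """Filenames scoring strictly above the threshold, best first."""
    keep = [name for name, score in scores.items() if score > threshold]
    return sorted(keep, key=scores.get, reverse=True)


scores = parse_scores(GRADER_OUTPUT)
keepers = passing_images(scores)
print(f"{len(keepers)} of {len(scores)} images above {THRESHOLD}")
print("enough for a final submission:", len(keepers) >= REQUIRED)
print(keepers)
```

The idea would be to run this over the output of each trial submission, pool the keepers, and build the final submission from them.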
The behavior of the grader is correlated with what looks good to the eye or to one's discriminator, but IMHO not so strongly correlated that one can rely upon either one.
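To put a rough number on "correlated, but not strongly", one could compute a Pearson correlation between the two sets of scores. The sketch below uses the values pasted above and assumes that row i of my discriminator output corresponds to hand{i}.png in the grader output (that is how I ordered them, but treat it as an assumption). With only 16 images this is an illustration, not a proper statistical test.

```python
import math

# My discriminator's outputs, transcribed from the tensor above.
disc = [
    7.9994911e-01, 9.9239588e-01, 9.4112718e-01, 7.7543765e-01,
    9.4984710e-01, 7.8973532e-01, 3.0495387e-01, 3.8384840e-01,
    5.7092643e-01, 3.2548344e-01, 5.2061573e-02, 1.8261197e-01,
    1.8240607e-04, 4.3043430e-05, 9.8573253e-02, 7.2770426e-04,
]

# Grader scores for hand0.png .. hand15.png, in the same (assumed) order.
grader = [
    0.894, 0.595, 0.797, 0.713, 0.375, 0.729, 0.325, 0.428,
    0.813, 0.739, 0.302, 0.382, 0.746, 0.506, 0.317, 0.753,
]


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


r = pearson(disc, grader)
print(f"Pearson r between my discriminator and the grader: {r:.2f}")

# Images the two disagree on, using 0.6 as the cutoff for both (assumption).
disagree = [i for i, (d, g) in enumerate(zip(disc, grader)) if (d > 0.6) != (g > 0.6)]
print("indices where they disagree:", disagree)
```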