Update: I found that @Elemento actually answered my question, because @Xiaojian_Deng made the same mistake I did. @Elemento’s answer is here. The gist of his answer is:
> Now, though we want to call detach on the fake images when we are updating the discriminator (since we don’t want to update the generator in this case), we don’t want the same thing to happen when we are updating the generator. And hence, when you changed the position of detach method, the generator didn’t update at all, and hence, led to empty squares.
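To make that concrete, here is a minimal sketch of where the detach belongs in the two update steps. This is my paraphrase, not the actual course code; `gen`, `disc`, `criterion`, `noise`, `real`, `disc_opt`, and `gen_opt` are placeholder names:

```python
import torch

# Sketch only: gen, disc, criterion, noise, real, disc_opt, gen_opt
# are placeholders, not the actual lab objects.

# --- Discriminator update: detach the fakes so no gradient reaches gen ---
fake = gen(noise)
disc_fake_pred = disc(fake.detach())               # detach HERE
disc_real_pred = disc(real)
disc_loss = (criterion(disc_fake_pred, torch.zeros_like(disc_fake_pred))
             + criterion(disc_real_pred, torch.ones_like(disc_real_pred))) / 2
disc_opt.zero_grad()
disc_loss.backward()
disc_opt.step()

# --- Generator update: NO detach, so gradients flow back through disc ---
fake = gen(noise)
disc_fake_pred = disc(fake)                        # no detach here
gen_loss = criterion(disc_fake_pred, torch.ones_like(disc_fake_pred))
gen_opt.zero_grad()
gen_loss.backward()
gen_opt.step()
```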
Thank you, @gent.spah, for the checklist! I found the problem, and I believe it is related to your checklist #1. The problem was that I had placed the “.detach()” in the wrong place:
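I won’t reproduce my exact cell here, but the misplacement amounted to something like this (hypothetical reconstruction, not my actual code):

```python
# Hypothetical reconstruction of the mistake, not my exact code:
fake = gen(noise).detach()        # detached at generation time -- too early

# The discriminator update still behaves (it wanted detached fakes anyway),
# but the generator update below now computes a loss whose graph never
# reaches gen's parameters, so gen_loss.backward() gives the generator no
# gradients and gen_opt.step() changes nothing -> the generator never
# learns, and the samples stay as empty squares.
disc_fake_pred = disc(fake)
gen_loss = criterion(disc_fake_pred, torch.ones_like(disc_fake_pred))
```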
This shows my ignorance of how to use the `.detach()` method. The `detach()` docs say:

> Returns a new Tensor, detached from the current graph.
> The result will never require gradient.

but the docs don’t show examples of where it should be placed.
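The behavior itself is easy to see in a REPL (standard PyTorch, not lab code):

```python
import torch

x = torch.randn(3, requires_grad=True)
y = x.detach()
print(y.requires_grad)   # False: y will never require gradient
print(y.grad_fn)         # None: y is cut out of the autograd graph
```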
For where to place the `.detach()`, in the Week 4 lab I was following the pattern of the Week 1 lab:
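As I remember it, the Week 1 discriminator step detaches the fakes right at the discriminator’s input, roughly like this (paraphrased, not the exact cell):

```python
# Week 1 lab pattern, from memory:
fake = gen(fake_noise)
disc_fake_pred = disc(fake.detach())   # detach only inside the disc update
disc_fake_loss = criterion(disc_fake_pred, torch.zeros_like(disc_fake_pred))
```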
Why is the Week 4 lab different, and why does placing the `.detach()` in the wrong location there cause training to fail?
BTW, the conversation “Why should we detach the discriminators input ?!” is very relevant, but after reading it through I don’t think it answers my question of where to put the `.detach()` method.
Thanks,
Steve