Why does the Gen use ReLU while the Disc uses LeakyReLU?

How about both using either ReLU or LeakyReLU?

Why don’t you try that and see what happens? The point is that the choice of activation functions is a “hyperparameter”, meaning a choice that you need to make as the system designer. The choice depends on the circumstances, i.e. what works. So when you see a case like this, your default assumption should be that they tried plain ReLU in the discriminator as well and it didn’t work very well.
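As a quick illustration (plain NumPy, not tied to any particular GAN implementation), here is the behavioral difference between the two on negative inputs: ReLU zeroes them out entirely, while LeakyReLU keeps a small nonzero slope (0.2 is a common choice in DCGAN-style discriminators). That small slope means a unit with negative pre-activations still passes some gradient, which is one common rationale for preferring LeakyReLU in the discriminator:

```python
import numpy as np

def relu(x):
    # Negative inputs are clipped to 0, so the gradient there is also 0
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.2):
    # Negative inputs keep a small slope alpha, so gradients still flow
    return np.where(x > 0, x, alpha * x)

x = np.array([-2.0, -0.5, 0.0, 1.5])
print(relu(x))        # [ 0.   0.   0.   1.5]
print(leaky_relu(x))  # [-0.4 -0.1  0.   1.5]
```

In a GAN, the discriminator's gradients are what train the generator, so "dead" ReLU units in the discriminator can starve the generator of learning signal; the leaky slope is a cheap hedge against that.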

Here’s a thread over in DLS that talks about the natural hierarchy of activation functions and the order in which you try them.

Thank you very much!