Will leaky ReLU be covered in detail in an upcoming course?
In the lecture video, at around 8:18, does the statement below mean that we should choose the parameter for the leaky ReLU function based on which value gives the best accuracy?
And you might say, why is that constant 0.01? Well, you can also make that another parameter of the learning algorithm.
Can someone please help me understand this? Is the value always fixed at 0.01 for the leaky ReLU activation function, or should we use cross-validation to find the best value?
max(0.01 * z, z)
Hi, 0.01 is not a hyperparameter that people usually tune, because it is not really that important. Leaky ReLU is not seen very frequently in practice either. The 0.01 is just there so that negative values are not zeroed out completely: a negative input still produces a small negative output (and a small gradient), which can matter in some scenarios. But for most cases, regular ReLU works wonderfully well, and even with leaky ReLU, changing the constant does not make a big difference to performance. So in most cases there is no need to make it a parameter of the learning algorithm or to search for it with cross-validation.
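To make the formula from the question concrete, here is a minimal sketch in plain Python. The function name `leaky_relu` and the argument name `alpha` are my own choices for illustration, not something taken from the course. It assumes `0 <= alpha <= 1`, which is the range where `max(alpha * z, z)` behaves as leaky ReLU.

```python
def leaky_relu(z, alpha=0.01):
    """Leaky ReLU for a single number z.

    alpha is the slope applied to negative inputs (hypothetical name).
    Assumes 0 <= alpha <= 1, so max(alpha * z, z) picks z when z >= 0
    and alpha * z when z < 0.
    """
    return max(alpha * z, z)


inputs = [-2.0, -0.5, 0.0, 1.5]

# Default slope of 0.01: negative inputs are shrunk, not zeroed.
print([leaky_relu(z) for z in inputs])

# alpha = 0 gives regular ReLU: negative inputs become zero.
print([leaky_relu(z, alpha=0.0) for z in inputs])
```

If you did want to try other values, `alpha` is the only thing you would change. As far as I know that is rarely worth the effort.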