Leaky ReLU is the function max(0.01z, z). Is the 0.01 value always fixed, or do we need to tune it using a cross-validation set?
I ask because in the Activation Functions lecture video, Prof. Andrew Ng says something like: "And you might say, why is that constant 0.01? Well, you can also make that another parameter of the learning algorithm."
What does "another parameter of the learning algorithm" mean here?
Hi @Anbu, I think Andrew Ng meant another hyperparameter of the learning algorithm: you could try different values for that slope and check which one works best for your particular case during the training phase.
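To make that concrete, here is a minimal sketch (plain NumPy, names and candidate values are just for illustration) of treating the slope as a tunable value rather than a fixed constant:

```python
import numpy as np

def leaky_relu(z, alpha=0.01):
    """Leaky ReLU: max(alpha * z, z), assuming alpha < 1."""
    return np.maximum(alpha * z, z)

z = np.array([-2.0, -0.5, 0.0, 1.0, 3.0])
# candidate slopes you might compare on a validation set
for alpha in (0.01, 0.1, 0.3):
    print(alpha, leaky_relu(z, alpha))
```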
Note that different deep learning frameworks use different default values for LeakyReLU: in PyTorch, for example, the default is 0.01, while in TensorFlow it is 0.3. Both frameworks allow you to specify the value manually, so you could treat it as a hyperparameter if you want to play with it.
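For reference, this is roughly how you would override the default slope in each framework (argument names as of recent PyTorch and TensorFlow 2.x releases; newer Keras versions rename `alpha` to `negative_slope`, so check your version's docs):

```python
import torch
import tensorflow as tf

# PyTorch: default negative_slope is 0.01; set it explicitly here
leaky_pt = torch.nn.LeakyReLU(negative_slope=0.2)

# TensorFlow/Keras: default alpha is 0.3; set it explicitly here
leaky_tf = tf.keras.layers.LeakyReLU(alpha=0.2)

x_pt = torch.tensor([-1.0, 0.0, 2.0])
x_tf = tf.constant([-1.0, 0.0, 2.0])
print(leaky_pt(x_pt))  # roughly [-0.2, 0.0, 2.0]
print(leaky_tf(x_tf))  # roughly [-0.2, 0.0, 2.0]
```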
I notice in the TensorFlow documentation that both relu and LeakyReLU accept an alpha parameter. Per the documentation, the default for relu is alpha=0.0, while the default for LeakyReLU is alpha=0.3.
My reading of the parameter explanation suggests that using relu with alpha=0.3 is the same as using LeakyReLU with alpha=0.3 (its default). Am I missing something? Why have both if you can just set the parameter in relu to accomplish the same thing?
Here are the relevant passages:
relu: With default values, this returns the standard ReLU activation: max(x, 0), the element-wise maximum of 0 and the input tensor.
Modifying default parameters allows you to use non-zero thresholds, change the max value of the activation, and to use a non-zero multiple of the input for values below the threshold.
From a brief hunt through the source code on GitHub, it looks like under the covers leaky_relu and relu end up calling the same lower-level code. They are both just wrappers that pass through different parameters/defaults.
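As a quick sanity check of that equivalence (assuming a TF 2.x install where tf.keras.activations.relu still takes an alpha argument), the two produce the same output:

```python
import tensorflow as tf

x = tf.constant([-3.0, -1.0, 0.0, 2.0])

# relu with a non-zero alpha multiplies negative inputs by alpha...
out_relu = tf.keras.activations.relu(x, alpha=0.3)

# ...which is exactly what the LeakyReLU layer does with the same alpha
out_leaky = tf.keras.layers.LeakyReLU(alpha=0.3)(x)

print(out_relu.numpy())   # [-0.9 -0.3  0.   2. ]
print(out_leaky.numpy())  # [-0.9 -0.3  0.   2. ]
```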