I am a bit confused. In course 1 of the specialization, in week 3, the professor says that it does not make sense to use linear functions as activation functions. I completely understood the justification for that. But it seems that ReLU is also a linear function, so why is it suitable as an activation function?
Hi @Sina_Kian ,
ReLU is a non-linear activation function. Its formula is f(x) = max(0, x): it outputs the input unchanged if the input is positive, and 0 otherwise. It is linear on each side of zero, but the kink at zero makes the function as a whole non-linear.
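One quick way to check this yourself is the additivity property: a linear function must satisfy f(a + b) = f(a) + f(b) for all inputs. Here is a minimal sketch in plain Python. The helper names `relu` and `linear` and the sample values are my own choices for illustration, not something from the course.

```python
def relu(x):
    """ReLU: returns x for positive inputs, 0 otherwise."""
    return max(0.0, x)


def linear(x, slope=2.0):
    """A linear function through the origin, used only for comparison."""
    return slope * x


# One positive and one negative input, so that ReLU's kink at zero matters.
a, b = 3.0, -5.0

# A linear f satisfies f(a + b) == f(a) + f(b) for every a and b.
linear_additive = linear(a + b) == linear(a) + linear(b)
relu_additive = relu(a + b) == relu(a) + relu(b)

print("linear is additive here:", linear_additive)
print("relu is additive here:", relu_additive)
```

With these values ReLU gives relu(-2) = 0 on the left but relu(3) + relu(-5) = 3 on the right, so additivity fails. A single counterexample like this is enough to show the function is not linear.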
In addition to @Kic's great answer, I believe your question is answered in this thread: Differences between ReLU and linear for positive values - #3 by Christian_Simonis
Please let us know if anything is unclear, @Sina_Kian.
I know that, technically, ReLU is not a linear function. But the effect of this function is exactly the same as the effect of a linear function.
I’ll try to look at the material that Christian_Simonis offered.
Both the ON (linear) state and the OFF state of a neuron are used judiciously by the NN to meet a specific goal, so let’s not be fooled into thinking that the OFF state can be ignored.
While using the linear part of the ReLU, the NN also needs to control exactly when the OFF state happens, so that each neuron works in harmony with the other neurons to produce the final output.
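To make the ON/OFF point concrete, here is a small sketch of a hidden layer with two ReLU units whose weights I picked by hand (they are assumptions for illustration, not learned values). One unit is ON only for positive inputs and the other only for negative inputs. Together they produce |x|, which no purely linear network can represent. The function names `tiny_relu_net` and `tiny_linear_net` are hypothetical.

```python
def relu(z):
    """ReLU: passes positive values through, switches OFF (0) otherwise."""
    return max(0.0, z)


def tiny_relu_net(x):
    """Two hidden ReLU units followed by a sum (hand-picked weights)."""
    h1 = relu(1.0 * x)   # ON when x > 0, OFF when x <= 0
    h2 = relu(-1.0 * x)  # ON when x < 0, OFF when x >= 0
    return h1 + h2       # equals |x|


def tiny_linear_net(x):
    """Same weights, but with a linear (identity) activation instead."""
    h1 = 1.0 * x
    h2 = -1.0 * x
    return h1 + h2       # the two units cancel, leaving 0 for every x


inputs = [-2.0, -1.0, 0.0, 1.0, 2.0]
relu_outputs = [tiny_relu_net(x) for x in inputs]
linear_outputs = [tiny_linear_net(x) for x in inputs]

print("with ReLU:  ", relu_outputs)
print("with linear:", linear_outputs)
```

The only difference between the two networks is the activation. With the identity activation the hidden units collapse into a single linear map, while with ReLU the OFF state of each unit is what lets the network bend its output at zero.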