Activation functions

Rajendra_Ambati1 · January 1, 2023, 9:37am

Why do we use the ReLU activation function in the hidden layers rather than other activation functions such as sigmoid, linear, or tanh?

Christian_Simonis · January 1, 2023, 11:37am

Hi there,

The benefits are:

There is a reduced risk of vanishing gradients, since the gradient in the positive section of the ReLU function is constant (equal to 1). In contrast to sigmoid or tanh, it does not saturate for positive inputs.
It can model non-linearity well, as discussed in this thread: Isn't Relu just a lineer regression function for z>=0 - #6 by Christian_Simonis. With a purely linear activation function this would not be possible.
ReLU is cheap to compute, y = max(0, x), and is therefore often faster than the alternatives.
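
To make the first two points more concrete, here is a small sketch in plain Python (not taken from the course material). It compares the local derivatives of ReLU, sigmoid, and tanh, and shows how a product of such derivatives shrinks over several layers. The helper names `chained_grad` and `two_layer` are hypothetical, and the "network" is a deliberately simplified scalar toy that ignores the effect of the weights on the gradient.

```python
import math


def relu(x):
    # ReLU as given above: y = max(0, x)
    return max(0.0, x)


def relu_grad(x):
    # Derivative of ReLU: 1 for x > 0, 0 otherwise
    # (the value at exactly x = 0 is set to 0 here by convention)
    return 1.0 if x > 0 else 0.0


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    # Derivative of sigmoid: s * (1 - s); it never exceeds 0.25
    s = sigmoid(x)
    return s * (1.0 - s)


def tanh_grad(x):
    # Derivative of tanh: 1 - tanh(x)^2; close to 0 for large |x|
    return 1.0 - math.tanh(x) ** 2


def chained_grad(grad_fn, x, depth):
    # Hypothetical helper: product of `depth` identical local derivatives.
    # A crude stand-in for the chain rule through `depth` layers,
    # ignoring the weights entirely.
    return grad_fn(x) ** depth


def identity(x):
    # "Linear activation": passes its input through unchanged
    return x


def two_layer(x, act, w1=1.0, b1=0.0, w2=1.0, b2=0.0):
    # Hypothetical scalar two-layer network: w2 * act(w1 * x + b1) + b2
    return w2 * act(w1 * x + b1) + b2


if __name__ == "__main__":
    for x in (0.5, 2.0, 5.0):
        print(x, relu_grad(x), sigmoid_grad(x), tanh_grad(x))

    # Gradient signal after 10 layers at a positive pre-activation
    print(chained_grad(relu_grad, 2.0, 10))
    print(chained_grad(sigmoid_grad, 2.0, 10))

    # With the identity activation the two-layer model stays a straight line,
    # so the value at 0 is the average of the values at -1 and 1.
    print(two_layer(-1.0, identity), two_layer(0.0, identity), two_layer(1.0, identity))
    # With ReLU the same model has a kink at 0 and is no longer a straight line.
    print(two_layer(-1.0, relu), two_layer(0.0, relu), two_layer(1.0, relu))
```

In this toy setting the ReLU derivative stays at 1 for every positive input, while the sigmoid and tanh derivatives become small as the input grows, so multiplying them across layers drives the result towards zero. Note that ReLU has its own drawback, which the sketch also shows: the derivative is exactly 0 for negative inputs.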

Please let me know if this answers your question!

Best regards and happy new year!

Christian

Christian_Simonis · January 1, 2023, 12:05pm

If you are interested in reading more, including an evaluation of different activation functions, feel free to take a look at this paper: https://arxiv.org/pdf/2109.14545.pdf

Best regards
Christian

Christian_Simonis · January 4, 2023, 8:16am

These threads may be interesting, too:

Best regards
Christian
