Activation functions

Hi there,

The main benefits of ReLU are:

  • there is a reduced risk of vanishing gradients, since the gradient in the positive section of the ReLU function is constant: it does not saturate, in contrast to sigmoid or tanh
  • ReLU still introduces non-linearity into the network, as discussed in this thread: Isn't Relu just a lineer regression function for z>=0 - #6 by Christian_Simonis. With a purely linear activation function this would not be possible.
  • ReLU is cheap to compute, y = max(0, x), and is therefore often faster than the alternatives (see the short sketch after this list)
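
To illustrate the first and third points, here is a minimal NumPy sketch (my own, not from the course materials) comparing the ReLU gradient with the sigmoid gradient. The function names are just illustrative:

```python
import numpy as np

def relu(x):
    # ReLU: y = max(0, x), element-wise
    return np.maximum(0.0, x)

def relu_grad(x):
    # Gradient is a constant 1 for x > 0 (no saturation), 0 otherwise
    return (x > 0).astype(x.dtype)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(x):
    # Gradient shrinks toward 0 for large |x| -> saturation
    s = sigmoid(x)
    return s * (1.0 - s)

z = np.array([-5.0, -1.0, 0.5, 5.0])
print(relu(z), relu_grad(z))   # gradient stays 1 for all positive inputs
print(sigmoid_grad(z))         # near-zero gradients for large |z|
```

Running it shows the sigmoid gradient collapsing toward zero at |z| = 5, while the ReLU gradient stays at 1 for positive inputs, which is exactly why deep networks with ReLU tend to train more reliably.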

Please let me know if this answers your question!

Best regards and happy new year!

Christian
