Similar to a linear function when used in a hidden layer: it just turns the layer into another linear layer and does not serve any purpose.
Isn't it the same with ReLU if all the input values are always positive?
Non-linearity comes into play when the output of the affine function (i.e. wx + b) is negative. Please keep in mind that the weights and bias can be any real number.
As you rightly observed, for a non-negative output of the affine function, linear and ReLU activations are the same.
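To make this concrete, here's a minimal NumPy sketch (the pre-activation values z are just made-up numbers for illustration): the identity activation and ReLU agree wherever z >= 0, and only differ when z is negative, which is exactly where ReLU's non-linearity comes from.

```python
import numpy as np

# Compare a linear (identity) activation with ReLU on a few
# pre-activation values z = w*x + b. They match for z >= 0 and
# differ only for negative z.

def linear(z):
    return z

def relu(z):
    return np.maximum(0.0, z)

z = np.array([-3.0, -0.5, 0.0, 0.5, 3.0])  # mix of negative and positive pre-activations
print("z:     ", z)
print("linear:", linear(z))  # [-3.  -0.5  0.   0.5  3. ]
print("relu:  ", relu(z))    # [ 0.   0.   0.   0.5  3. ]
```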
I still don't understand the need for ReLU compared to tanh and sigmoid?
The advantage ReLU has over tanh and sigmoid is that it’s a lot faster to compute.
Please read this page on why ReLU is a good choice, in the Advantages section. That said, one problem that's worth noting on the same page is the dying ReLU problem. A variant of the ReLU function called Leaky ReLU can be used to mitigate this issue.
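For reference, here's a quick sketch of ReLU next to Leaky ReLU (the slope alpha below is an assumed small value, often around 0.01). Leaky ReLU keeps a small non-zero output (and gradient) for negative inputs, so a unit can't get permanently stuck at zero the way a dying ReLU can.

```python
import numpy as np

# ReLU zeroes out all negative inputs; Leaky ReLU lets a small
# fraction (alpha) of negative inputs through instead.

def relu(z):
    return np.maximum(0.0, z)

def leaky_relu(z, alpha=0.01):
    return np.where(z >= 0, z, alpha * z)

z = np.array([-2.0, -0.1, 0.0, 0.1, 2.0])
print("relu:      ", relu(z))        # [ 0.     0.     0.     0.1    2.   ]
print("leaky_relu:", leaky_relu(z))  # [-0.02  -0.001  0.     0.1    2.   ]
```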
I don’t think anyone uses tanh as an activation function in intermediate dense layers since ReLU (and its variants) are the default choice nowadays.