If we use ReLU in all the hidden layers and sigmoid for the output layer, wouldn't that be almost like using a plain sigmoid activation function without an ANN?
I understood how a linear activation function in the hidden layers with sigmoid in the output layer would make using an ANN pointless, not to mention a waste of resources, as Prof. Ng mentioned.
Wouldn't replacing the linear activation function with ReLU have the same effect (at least in some cases)?
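To make the collapse I'm describing concrete, here is a small NumPy sketch (the weights and layer sizes are made up purely for illustration): with a linear (identity) activation in the hidden layer, two stacked layers followed by a sigmoid give exactly the same output as one combined linear layer followed by a sigmoid, i.e. plain logistic regression.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(3, 1))                                 # one input with 3 features
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=(4, 1))   # hidden layer (hypothetical)
W2, b2 = rng.normal(size=(1, 4)), rng.normal(size=(1, 1))   # output layer (hypothetical)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hidden layer with a *linear* (identity) activation:
a1 = W1 @ x + b1
y_two_layer = sigmoid(W2 @ a1 + b2)

# The same result from a single combined linear layer + sigmoid,
# i.e. logistic regression -- the hidden layer adds nothing.
W_combined = W2 @ W1
b_combined = W2 @ b1 + b2
y_one_layer = sigmoid(W_combined @ x + b_combined)

print(np.allclose(y_two_layer, y_one_layer))  # True
```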
I believe this confusion stems from the idea that ReLU is a linear function, but that is only half the story.
ReLU is linear on [0, \infty) but outputs a constant 0 on (-\infty, 0], which makes it non-linear over the whole real line. While we generally focus on the [0, \infty) part, the 0 output on the negative side is just as important: it quietly controls where the kinks (the points where the slope changes) are located, and those kinks are what allow a neural network to model just about any output function.
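To make that concrete, here is a minimal NumPy sketch (the weights and kink locations are made up for illustration): each unit relu(w*x + b) outputs exactly 0 until x = -b/w and is linear afterwards, so the bias picks where the kink sits, and summing a few units bends the curve at those chosen points, which no purely linear combination can do.

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

x = np.linspace(-2.0, 2.0, 9)

# Each unit is flat (exactly 0) until x = -b/w, then grows linearly;
# the bias chooses the kink location.
h1 = relu(1.0 * x + 1.0)   # kink at x = -1
h2 = relu(1.0 * x + 0.0)   # kink at x =  0
h3 = relu(1.0 * x - 1.0)   # kink at x = +1

# A weighted sum of the units gives a piecewise-linear curve
# that changes slope at -1, 0 and +1.
y = 1.0 * h1 - 2.0 * h2 + 2.0 * h3
print(np.round(y, 2))
```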
That clears my doubt! I had that doubt because I assumed ReLU is a linear function and neglected to consider the negative range. The lab session that week cleared it up as well; there's a beautiful explanation with a graph that gives the intuition for why ReLU is non-linear.