As mentioned, the activation function should not be linear. Still, the ReLU activation function is only piecewise linear: its derivative is 1 for positive inputs and 0 for negative inputs. So why are we using it? Suppose a certain layer receives only positive input values; then ReLU acts as the identity, the layer behaves like a purely linear one, and it becomes redundant, only adding computation. Could you please tell me what could be done in such scenarios? (A small illustration of this concern is sketched below.)
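To make the concern concrete, here is a minimal NumPy sketch (illustrative only, not from the course materials) of ReLU and its derivative, showing that ReLU reduces to the identity when every input is positive:

```python
import numpy as np

def relu(z):
    """ReLU activation: max(0, z), applied element-wise."""
    return np.maximum(0, z)

def relu_derivative(z):
    """Derivative of ReLU: 1 where z > 0, 0 where z <= 0."""
    return (z > 0).astype(float)

# Mixed-sign inputs: ReLU is non-linear (negatives are zeroed out).
z_mixed = np.array([-2.0, -0.5, 0.5, 2.0])
print(relu(z_mixed))             # [0.  0.  0.5 2. ]
print(relu_derivative(z_mixed))  # [0. 0. 1. 1.]

# All-positive inputs: ReLU acts as the identity, so the layer is
# effectively linear for this batch.
z_pos = np.array([0.5, 1.0, 2.0])
print(np.allclose(relu(z_pos), z_pos))  # True
```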
There will be more to come on this topic in Course 2. As a brief preview, it is common practice to "normalize" not only the feature matrix (X = A^{[0]}), but also the inputs of subsequent layers. By normalization, I mean standardizing the inputs by subtracting the mean and dividing by the standard deviation. The result is a zero-mean input: positive and negative values guaranteed!
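A minimal sketch of that standardization step, assuming the (n_features, m_examples) layout used in the course; the `standardize` helper and the `eps` guard are illustrative names, not from the original post:

```python
import numpy as np

def standardize(X, eps=1e-8):
    """Standardize each feature (row) of X: subtract the mean and divide
    by the standard deviation, computed across the m training examples.
    eps avoids division by zero for constant features."""
    mu = np.mean(X, axis=1, keepdims=True)
    sigma = np.std(X, axis=1, keepdims=True)
    return (X - mu) / (sigma + eps)

# Even if the raw features are strictly positive, the standardized
# features have zero mean, so both signs appear in the normalized inputs.
X = np.abs(np.random.randn(3, 5)) + 1.0     # strictly positive raw inputs
X_norm = standardize(X)
print(np.allclose(X_norm.mean(axis=1), 0))  # True: zero mean per feature
print((X_norm < 0).any())                   # True: negative values present
```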
Thank you for your help.