C2_W2_Relu - How does the ReLU work?

In this lab, the ReLU activation is described as follows:

The “off” or disable feature of the ReLU activation enables models to stitch together linear segments to model complex non-linear functions.

What I am confused about is this: if every neuron sees the whole dataset, how does the ReLU activation of each neuron in one layer produce these segments? How does it know where to cut the line and start a new segment?


Hello @Zephyrus,

This post explained why neurons can act differently: they are initialized to different values.

Then gradient descent guides each neuron's parameters to change so that the cost is minimized.
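Here is a minimal NumPy sketch of that idea (a toy setup I made up for illustration, not code from the lab): two ReLU units start from different random parameters, and plain gradient descent moves them apart so that their sum fits a piecewise linear target.

```python
import numpy as np

# Toy illustration: fit y = |x| with f(x) = ReLU(w1*x + b1) + ReLU(w2*x + b2).
# The two units start from different random values of w.
rng = np.random.default_rng(0)
w = rng.normal(size=2)              # each neuron starts with a different w
b = np.zeros(2)

x = np.linspace(-2, 2, 101)
y = np.abs(x)                       # target: a "V" made of two linear segments

lr = 0.05
for _ in range(2000):               # plain gradient descent on the squared-error cost
    z = np.outer(x, w) + b          # (101, 2) pre-activations, one column per unit
    a = np.maximum(z, 0)            # ReLU
    err = a.sum(axis=1) - y         # prediction error at each sample
    grad_z = (z > 0) * err[:, None] # gradient reaches a unit only where it is "on"
    w -= lr * (grad_z * x[:, None]).mean(axis=0)
    b -= lr * grad_z.mean(axis=0)

print(np.round(w, 2), np.round(b, 2))  # typically close to w = [1, -1], b = [0, 0]
```

Because the two units started differently, gradient descent ends up giving them different jobs: one handles the segment on the right of the kink, the other the segment on the left.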

ReLU itself is also a piecewise linear function (it changes direction at x = 0), and this property is “inherited” by any function that is a sum of ReLU functions. For example, take 2 ReLUs: ReLU(x) and ReLU(x-1).

ReLU(x) turns at x = 0, and ReLU(x-1) turns at x = 1. If you add the two up, the resulting ReLU(x) + ReLU(x-1) turns first at x = 0 and then again at x = 1. In general, ReLU(wx + b) turns where wx + b = 0, that is, at x = -b/w, so where a segment turns is decided by the parameters w and b, and those parameters are changed by gradient descent.
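To see that in numbers, here is a tiny NumPy check (again just an illustration, not from the lab):

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0)

x = np.array([-1.0, 0.0, 0.5, 1.0, 1.5, 2.0])
print(relu(x))                # 0, 0, 0.5, 1, 1.5, 2   -> slope changes at x = 0
print(relu(x - 1))            # 0, 0, 0,   0, 0.5, 1   -> slope changes at x = 1
print(relu(x) + relu(x - 1))  # 0, 0, 0.5, 1, 2,   3   -> kinks at both x = 0 and x = 1
```

The sum is still linear between the kinks, but its slope changes at each of them, which is exactly how stacking ReLU units stitches linear segments together.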

Raymond


Thanks for this clear explanation! @rmwkwok

You are welcome @Zephyrus 🙂