C2_W2_Relu activation lab

I have been curious to learn the details of exactly how this works, so I did an experiment.

I created a training set that is a parabolic curve, where x goes from -5 to +4, and y = x^2.

I set up a 2-layer NN, with one input unit, 5 hidden layer units with ReLU activation, and one output unit with linear activation.
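
In case it helps anyone reproduce this, here is a minimal sketch of that setup, assuming TensorFlow/Keras (the layer names, optimizer, and epoch count are my own choices, not necessarily what the lab uses):

```python
import numpy as np
import tensorflow as tf

# Parabolic training set: x from -5 to +4, y = x^2
X = np.linspace(-5, 4, 100).reshape(-1, 1)
Y = X ** 2

# 2-layer network: 5 ReLU hidden units, 1 linear output unit
model = tf.keras.Sequential([
    tf.keras.Input(shape=(1,)),
    tf.keras.layers.Dense(5, activation='relu', name='hidden'),
    tf.keras.layers.Dense(1, activation='linear', name='output'),
])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.01), loss='mse')
model.fit(X, Y, epochs=500, verbose=0)
```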

It converged nicely, and here is a plot of ‘y’ and ‘y-hat’.

Here’s what is inside each ReLU unit:
output = max(0, w*x + b)
So each ReLU unit can learn two values: the slope of its line segment (w), and the bias (b), which determines where the output is clipped to 0.

  • If ‘w’ is negative, then the curve looks like '\_'.
  • If ‘w’ is positive, then it looks like '_/'.
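
To make those two shapes concrete, here is a tiny sketch of a single ReLU unit; the weight and bias values are made up purely for illustration:

```python
import numpy as np

x = np.linspace(-5, 4, 10)

def relu_unit(x, w, b):
    """Output of one ReLU unit: max(0, w*x + b)."""
    return np.maximum(0, w * x + b)

# Negative w: active (non-zero) only for small x, so the curve looks like '\_'
print(relu_unit(x, w=-1.0, b=-2.0))
# Positive w: active only for large x, so the curve looks like '_/'
print(relu_unit(x, w=1.0, b=-2.0))
```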

Looking just at what the ReLU units are learning, here is a plot that shows the output of each ReLU unit (1 through 5).

You can see that two units have negative ‘w’ and three have positive ‘w’. All five units have different bias values, which shifts each unit’s line segment vertically and so moves the point where the unit turns on. Because of the shape of this training set (all y values are positive), all of the bias values are negative.

All of the units in this example have slightly different slope values - it’s subtle but evident. Here are the biases and weights for the hidden layer:
[image: hidden-layer weights and biases]
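
For anyone who wants to recreate that plot, this is roughly how the hidden-layer parameters and per-unit outputs can be pulled out (continuing the Keras sketch above; the layer name 'hidden' is my own):

```python
# W1 has shape (1, 5) and b1 has shape (5,): one weight and one bias per hidden unit.
W1, b1 = model.get_layer('hidden').get_weights()
print("hidden weights:", W1.flatten())
print("hidden biases: ", b1)

# Recompute each unit's ReLU output over the training inputs.
# Each column of A1 is one of the '\_' or '_/' curves in the plot.
A1 = np.maximum(0, X @ W1 + b1)   # shape (num_samples, 5)
```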

At the output layer, each of these ReLU outputs is multiplied by an output weight, and the results are summed together with the output bias.
So again there is a chance to re-scale each ReLU output before they are all combined in the output unit.

Here are the weights and bias for the output unit:
[image: output-layer weights and bias]
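
Putting it together, the model’s prediction can be rebuilt by hand from the ReLU outputs and these output-layer parameters (again continuing the sketch above; 'output' is my own layer name):

```python
# W2 has shape (5, 1) and b2 has shape (1,): one weight per ReLU output, plus one bias.
W2, b2 = model.get_layer('output').get_weights()
print("output weights:", W2.flatten(), "output bias:", b2)

# y-hat is just the weighted sum of the ReLU outputs, plus the output bias.
y_hat_manual = A1 @ W2 + b2
y_hat_model = model.predict(X, verbose=0)
print(np.allclose(y_hat_manual, y_hat_model, atol=1e-5))   # should print True
```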

Conclusion:
It is incorrect to say that each ReLU unit learns one segment of a piecewise linear function. Each unit does contribute a linear segment, but the final shape of the model output also depends on the weighted sum of all of the ReLU outputs.
