I’ve been playing with functions that look like this:

f(x) = g(w1*x + b1) + g(w2*x + b2) + g(w3*x + b3)
g(x) = max(0, x) [the ReLU function]

(I’m adding 3 functions as an example; think “one function per segment”.)
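In code, a minimal sketch of this setup could look like the following (all the w and b values are made up, just to have something concrete):

```python
import numpy as np

def g(z):
    # ReLU: g(z) = max(0, z)
    return np.maximum(0.0, z)

# Made-up parameters, one (w, b) pair per ReLU term
w = [1.0, 1.0, 1.0]
b = [0.0, -1.0, -2.0]

def f(x):
    # f(x) = g(w1*x + b1) + g(w2*x + b2) + g(w3*x + b3)
    return sum(g(wi * x + bi) for wi, bi in zip(w, b))

print([f(x) for x in [-1.0, 0.0, 1.0, 2.0, 3.0]])
```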
Is there a way to model the example function given in the sub-section “Why Non-Linear Activations?” in the lab using ReLU?
I’m specifically referring to the part where the slope becomes negative. I don’t think it can be done, because you’re adding terms and ReLU makes every function you add always >= 0. It makes me think that, with each consecutive function you add, you can only increase the slope or keep it constant. Am I missing something?
Hi TMosh, thanks for the reply, and sorry for the late answer!
Yes, the weights can be negative, but ReLU ultimately makes the whole term g(w*x + b) >= 0 again, so I don’t see a way to subtract from the function and make its slope negative. It seems like each term you add can only do one of two things (I put a quick numerical check after the list):
- add to the slope, if the function you’re adding (say g(w3*x + b3)) is > 0, or
- keep the slope as it is, if w3*x + b3 < 0, because g(w3*x + b3) = max(0, w3*x + b3) returns 0 in that case.
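Here is that quick numerical check (parameters made up): with each ReLU just added as-is, the estimated slope never decreases as x grows.

```python
import numpy as np

def g(z):
    # ReLU: g(z) = max(0, z)
    return np.maximum(0.0, z)

# Made-up parameters: each ReLU "turns on" at a different x
w = [1.0, 1.0, 1.0]
b = [0.0, -1.0, -2.0]

def f(x):
    return sum(g(wi * x + bi) for wi, bi in zip(w, b))

# Estimate the slope on consecutive unit intervals
for x0 in [-1.0, 0.0, 1.0, 2.0]:
    slope = f(x0 + 1.0) - f(x0)
    print(f"slope on [{x0}, {x0 + 1.0}]: {slope}")
```

This prints slopes of 0, 1, 2, 3: flat or increasing, never decreasing. That matches the intuition above, since each g(w*x + b) is convex and a sum of convex functions is convex, so its slope can only stay the same or grow.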
Also, I’m sorry, I’m not sure I understood the exercise you suggested; could you explain a bit more? Thank you for taking the time to answer!
We are not talking about the weight inside g(...) being negative; we are talking about the one outside, meaning the weight in the next layer. For example, we can have -3 * g(w*x + b), where w is a weight in the previous layer and -3 is a weight in this layer.
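Concretely, here is a minimal sketch of that idea (the -3 and every other parameter value below are made up for illustration):

```python
import numpy as np

def g(z):
    # ReLU: g(z) = max(0, z)
    return np.maximum(0.0, z)

w = [1.0, 1.0, 1.0]    # layer-1 weights (inside g)
b = [0.0, -1.0, -2.0]  # layer-1 biases
a = [1.0, -3.0, 1.0]   # layer-2 weights (outside g); note the -3

def f(x):
    # f(x) = a1*g(w1*x + b1) + a2*g(w2*x + b2) + a3*g(w3*x + b3)
    return sum(ai * g(wi * x + bi) for ai, wi, bi in zip(a, w, b))

# Estimate the slope on consecutive unit intervals
for x0 in [0.0, 1.0, 2.0]:
    print(f"slope on [{x0}, {x0 + 1.0}]: {f(x0 + 1.0) - f(x0)}")
```

This should print slopes of 1.0, -2.0, and -1.0: once the unit with the -3 coefficient activates at x = 1, it drags the total slope negative, which is exactly the downward segment that a plain sum of ReLUs could not produce.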