Convolutional Neural Networks
#week1
#relu
Deep Learning Specialization
Hi, I have a question.
First, let's assume we have a convnet in which every layer uses a ReLU activation.
ReLU works fine in the first hidden layer, and after it all of the outputs are >= 0. But for the next layers, applying ReLU doesn't seem to change anything, since the values are still >= 0. Overall, that would make our convnet linear, just like a logistic regression.
Can you tell me why it still works well, and where my mistake is?
Hi @mhaydari81
A function f: A → B is linear only if, for every x and y in the domain A, it has the following property:
f(x) + f(y) = f(x + y)
ReLU is not linear: f(−1) + f(1) = 0 + 1 = 1, while f(0) = 0, so f(−1) + f(1) ≠ f(0). It is used precisely to introduce non-linearity into neural networks. While it may seem like ReLU leaves the network linear, the depth of the network and the training process enable it to learn non-linear representations effectively.
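As a quick sanity check, here is a minimal NumPy sketch (not from the course materials) that verifies the counterexample above numerically:

```python
import numpy as np

def relu(x):
    return np.maximum(0, x)

# Additivity check for f(x) + f(y) = f(x + y) at x = -1, y = 1
print(relu(-1) + relu(1))  # 0 + 1 = 1
print(relu(-1 + 1))        # relu(0) = 0  -> not equal, so ReLU is not linear
```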
To show that a ReLU activation introduces non-linearity, consider two inputs x_1 and x_2 and their corresponding outputs y_1 and y_2 after passing through the convolutional layer and the ReLU activation:
y_1 = ReLU(Wx_1 + b)
y_2 = ReLU(Wx_2 + b)
Now, let's consider a linear combination of these outputs, with scalar coefficients \alpha and \beta (renamed so they don't clash with the bias b):
\begin{align*} \alpha y_1 + \beta y_2 &= \alpha\, \text{ReLU}(Wx_1 + b) + \beta\, \text{ReLU}(Wx_2 + b) \\ &= \alpha \max(0, Wx_1 + b) + \beta \max(0, Wx_2 + b) \end{align*}
The max function is non-linear, so in general this is not equal to ReLU(W(\alpha x_1 + \beta x_2) + b). The layer does not preserve linear combinations of its inputs, which shows that the ReLU activation introduces non-linearity into the CNN.
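Here is a minimal NumPy sketch of that argument; the dense affine map stands in for a convolution, and the shapes and coefficients are just illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(0, z)

# A tiny "conv-like" affine layer followed by ReLU (random weights,
# so some pre-activations are negative and get clipped).
W = rng.standard_normal((3, 4))
b = rng.standard_normal(3)

def layer(x):
    return relu(W @ x + b)

x1 = rng.standard_normal(4)
x2 = rng.standard_normal(4)
a, c = 2.0, -1.5  # arbitrary scalar coefficients

lhs = a * layer(x1) + c * layer(x2)  # linear combination of the outputs
rhs = layer(a * x1 + c * x2)         # layer applied to the combined input

print(np.allclose(lhs, rhs))  # almost surely False: the layer is not linear
```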
Link for more information:
Hope the explanation above helps, feel free to ask if you have any questions.
Thank you for your detailed response.
Well, let's assume that w, b, and x are all already greater than zero. Then ReLU doesn't have any effect in our equation, and that makes our equation linear!
So I have a question:
in every equation, does there have to be a negative value (among w, b, and x) for ReLU to actually act as a non-linearity in our equation?
Hi @mhaydari81 ,
ReLU is a piecewise linear function: a positive input is passed through unchanged, and a negative input is mapped to zero.
During the training phase, the parameters (W and b) are updated and can take negative values, so the pre-activations are not guaranteed to stay positive. I think you are only considering the positive branch of ReLU, and that is what confuses you. If you treat it as @Kic mentioned, your problem will be solved! The short sketch below illustrates the point.
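A minimal sketch of that scenario, with hypothetical parameter values chosen only for illustration: even when every input is positive, a learned negative weight or bias can push the pre-activation below zero, so ReLU's kink becomes active.

```python
import numpy as np

def relu(z):
    return np.maximum(0, z)

# All inputs are positive, as in the scenario discussed above.
x = np.array([0.5, 1.0, 2.0])

# But learned parameters can be negative (hypothetical values, just for illustration).
w = np.array([-1.2, -0.3, 0.2])
b = -0.5

z = w @ x + b
print(z)        # -1.0: the pre-activation is negative even though every input is positive
print(relu(z))  # 0.0: ReLU clips it, so the unit really does act non-linearly
```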
There are other ReLU variants that can be helpful to take a look at (a small sketch of each follows the list):
- ReLU
- Leaky ReLU
- Parametric ReLU
- Exponential Linear Unit
- Scaled Exponential Linear Unit
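For reference, here are minimal NumPy sketches of those variants. The default constants follow the commonly published values; treat them as illustrative, not as the course's own definitions.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    # Small fixed slope for negative inputs
    return np.where(x > 0, x, alpha * x)

def prelu(x, alpha):
    # Same shape as Leaky ReLU, but alpha is a learned parameter
    return np.where(x > 0, x, alpha * x)

def elu(x, alpha=1.0):
    # Smooth exponential curve for negative inputs
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

def selu(x, scale=1.0507009873554805, alpha=1.6732632423543772):
    # ELU scaled so activations are approximately self-normalizing
    return scale * np.where(x > 0, x, alpha * (np.exp(x) - 1.0))

z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
for f in (relu, leaky_relu, elu, selu):
    print(f.__name__, f(z))
```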
Thank you very much for your assistance.
You're welcome! Happy to help.
Good point.
This is the key concept that isn’t initially obvious.
Typically ReLU is presented as a function of some value z, but it isn't always emphasized that z = w*x + b. There are weights and a bias that must be learned within a unit that uses a ReLU activation.
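A tiny sketch of that point, with hypothetical w and b values: the ReLU kink sits at x = -b/w, so where the non-linearity "acts" is itself something the unit learns.

```python
import numpy as np

def relu_unit(x, w, b):
    # A single unit: learned affine map z = w*x + b, then ReLU
    return np.maximum(0.0, w * x + b)

x = np.linspace(-2, 2, 5)

# Two hypothetical settings of the learned parameters: both the location of
# the kink (x = -b/w) and the slope change as w and b are learned.
print(relu_unit(x, w=1.0, b=0.0))   # kink at x = 0
print(relu_unit(x, w=-2.0, b=1.0))  # kink at x = 0.5, clips the right-hand side instead
```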