Convolution followed by relu

Convolutional Neural Networks
#week1
#relu
Deep Learning Specialization
Hi, I have a question.
First, let's assume we have a convnet in which every layer has a ReLU activation.
Well, ReLU works fine for the first hidden units, and now all of our activations are >= 0. But for the next layers, when we use ReLU it doesn't change anything, since the values are still >= 0. Overall, that would make our convnet linear, just like a logistic regression.
Can you tell me why it still works well, and where my mistake is?


Hi @mhaydari81

A function f: A → B is considered linear if, for every x and y in the domain A, it has the following property:
f(x) + f(y) = f(x + y)

ReLU is not linear (f(−1)+f(1)≠f(0)) and it is used to introduce non-linearity in neural networks. While it may seem like ReLU makes the network linear, the depth of the network and the training process enable it to learn non-linear representations effectively.
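As a quick sanity check, here is a minimal NumPy sketch (not course code, just an illustration) that runs the additivity test above on ReLU:

```python
import numpy as np

def relu(x):
    """Element-wise ReLU: max(0, x)."""
    return np.maximum(0.0, x)

# Additivity test: a linear f must satisfy f(x) + f(y) == f(x + y)
x, y = -1.0, 1.0
print(relu(x) + relu(y))  # 0.0 + 1.0 = 1.0
print(relu(x + y))        # relu(0.0) = 0.0 -> additivity fails, so ReLU is not linear
```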

To show that the composition of ReLU activations introduces non-linearity, consider two inputs x_1 and x_2, and their corresponding outputs y_1 and y_2 after passing through the convolutional layer and ReLU activation:

y_1 = ReLU(Wx_1 + b)
y_2 = ReLU(Wx_2 + b)

Now, let's consider a linear combination of these outputs, with scalar coefficients c_1 and c_2 (using different letters so they aren't confused with the bias b):

\begin{align*} c_1 y_1 + c_2 y_2 &= c_1 \text{ReLU}(Wx_1 + b) + c_2 \text{ReLU}(Wx_2 + b) \\ &= c_1 \max(0, Wx_1 + b) + c_2 \max(0, Wx_2 + b) \end{align*}

The max function is non-linear, so this combination is non-linear with respect to x_1 and x_2: in general it is not equal to \text{ReLU}(W(c_1 x_1 + c_2 x_2) + b). This shows that composing the convolutional layer with the ReLU activation does not give a linear map of the input, i.e. ReLU introduces non-linearity into the CNN.
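To see this numerically, here is a small sketch with made-up W and inputs (the bias is set aside, since a bias term alone already makes the map affine rather than linear; the point is to isolate what ReLU does):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))                      # made-up weights, illustrative only
x1, x2 = rng.normal(size=4), rng.normal(size=4)  # made-up inputs

def relu(z):
    return np.maximum(0.0, z)

# Without ReLU, the layer W @ x is a linear map: it preserves sums.
print(np.allclose(W @ (x1 + x2), W @ x1 + W @ x2))                    # True

# With ReLU, additivity breaks, so the layer is no longer a linear map of x.
print(np.allclose(relu(W @ (x1 + x2)), relu(W @ x1) + relu(W @ x2)))  # False (in general)
```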


Link for more information:


Hope the explanation above helps, feel free to ask if you have any questions.

Thank you for your detailed response.
Well, let's assume that w, b, and x are all already greater than zero; then ReLU doesn't have any effect in our equation, and that makes our equation linear!
So I have a question:
does every equation need a negative value (among w, b, and x) to make it non-linear, so that ReLU actually acts as a non-linearity in our equation?

Hi @mhaydari81 ,

ReLU is a piecewise linear function: a positive input is passed through as is, and a negative input is output as zero.


During the training phase, the parameters (W and b) are updated and can take negative values. I think you are only considering the positive part of ReLU, and that is what confuses you. If you treat it as @Kic mentioned, your problem will be solved!
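To make that concrete, here is a small sketch (made-up numbers, not from the course) showing that even when all the inputs are non-negative, learned weights with negative entries can push some pre-activations below zero, so ReLU really does clip something:

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

# All inputs are non-negative (e.g. outputs of a previous ReLU layer)
x = np.array([0.5, 2.0, 1.0])

# Made-up "learned" parameters: training can push some entries negative
W = np.array([[ 0.8, -1.5,  0.3],
              [-0.4,  0.2,  0.9]])
b = np.array([0.1, -1.0])

z = W @ x + b   # pre-activations: roughly [-2.2, 0.1] -> the first one is negative
a = relu(z)     # roughly [0.0, 0.1] -> ReLU clips it, even though every x_i >= 0

print(z, a)
```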

There are other types of ReLUs that can be helpful if you take a look at them (a small sketch of each is given after the list):

  1. ReLU
  2. Leaky ReLU
  3. Parametric ReLU
  4. Exponential Linear Unit
  5. Scaled Exponential Linear Unit
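Here is a plain-NumPy sketch of these variants (simplified, element-wise; the alpha and scale constants are just commonly used defaults, not something prescribed by the course):

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def leaky_relu(z, alpha=0.01):
    return np.where(z > 0, z, alpha * z)

def parametric_relu(z, alpha):
    # Same form as Leaky ReLU, but alpha is a learned parameter
    return np.where(z > 0, z, alpha * z)

def elu(z, alpha=1.0):
    return np.where(z > 0, z, alpha * (np.exp(z) - 1.0))

def selu(z, alpha=1.6733, scale=1.0507):
    # Scaled ELU; the constants are the usual published values, rounded
    return scale * np.where(z > 0, z, alpha * (np.exp(z) - 1.0))

z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(z))
print(leaky_relu(z))
print(elu(z))
```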

Thank you very much for your assistance.

You're welcome! Happy to help :raised_hands:

Good point.
This is the key concept that isn’t initially obvious.

Typically ReLU is presented as a function of some value 'z', but it isn't always emphasized that z = w*x + b. There are weights and biases that must be learned within a unit that uses the ReLU activation.
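A tiny sketch of that idea (with hypothetical numbers): a single ReLU unit a = ReLU(w*x + b) is piecewise linear in x, with a kink where z = w*x + b crosses zero, i.e. at x = -b/w, so the learned w and b decide where the non-linearity sits:

```python
import numpy as np

w, b = 2.0, -1.0             # made-up "learned" parameters; kink at x = -b/w = 0.5

x = np.linspace(-2.0, 2.0, 9)
z = w * x + b                # the pre-activation the unit actually computes
a = np.maximum(0.0, z)       # ReLU is applied to z, not directly to x

for xi, ai in zip(x, a):
    print(f"x = {xi:+.1f} -> a = {ai:.2f}")   # zero up to x = 0.5, then grows linearly
```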
