Non-linearity of neural networks

Hello,

I was working on the C2_W2_Relu Optional Lab and noticed that when I change the w and b values of the ReLU activation, it only changes the positions of the linear pieces; they are still linear pieces. Now consider a bigger network: suppose we have data that follows -x^2 exactly, the hidden layers use ReLU activations, and the output layer uses a linear activation. Suppose also that the input values in our dataset range from -100 to 100. My question is: for predicting really large or really small values of x, will the neural network behave linearly instead of parabolically? I think the small linear pieces will fit the data and output values close to -x^2 over part of the range, but for very large or very small x there is only one linear piece left to make the prediction, so the network will behave linearly there. The idea can be seen more clearly in the image below and in the sketch that follows it.

[Image: Screenshot 2024-08-19 at 23.41.25]
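
Here is a minimal NumPy sketch of that first observation (not code from the lab; the w and b values are arbitrary): a single unit a = relu(w*x + b) only moves its kink to x = -b/w and changes its slope as w and b change; each piece stays a straight line.

```python
import numpy as np

def relu(z):
    """ReLU activation: max(0, z)."""
    return np.maximum(0, z)

x = np.linspace(-3, 3, 7)

# Changing w and b only moves the kink (at x = -b/w) and rescales the slope;
# on either side of the kink the output is still a straight line.
for w, b in [(1.0, 0.0), (2.0, 1.0), (-0.5, 0.5)]:
    print(f"w={w:5.1f}, b={b:5.1f} ->", np.round(relu(w * x + b), 2))
```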


Your intuition is correct: in this scenario a ReLU network may struggle to predict extreme values accurately because of ReLU's piecewise-linear behavior. In the mid-range, where many linear pieces are available, the network can approximate the quadratic curve quite well. For very large or very small values of x, however, the pattern of active ReLU units stops changing, so only a single linear segment is left. The network then produces a linear extrapolation rather than capturing the continuing curvature of the quadratic function.
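
A quick way to see this, as a sketch only (plain NumPy, a random untrained one-hidden-layer ReLU network; the layer size and weight values are made up for illustration): every hidden unit has its kink somewhere, and once x is past the outermost kink the unit activations stop changing, so the network's output becomes a single straight line while -x^2 keeps curving.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny one-hidden-layer ReLU network with random (untrained) weights -- the
# specific values are hypothetical; only the shape of the function matters.
W1 = rng.normal(size=(8, 1))   # hidden-layer weights
b1 = rng.normal(size=(8,))     # hidden-layer biases
W2 = rng.normal(size=(1, 8))   # linear output-layer weights
b2 = rng.normal(size=(1,))     # linear output-layer bias

def predict(x):
    """Forward pass: ReLU hidden layer, linear output layer."""
    h = np.maximum(0, x[:, None] @ W1.T + b1)   # shape (n, 8)
    return h @ W2.T + b2                        # shape (n, 1)

# Each hidden unit switches on/off at x = -b/w; past the right-most kink the
# unit activations stop changing, so the network is one straight line there.
kinks = -b1 / W1[:, 0]
print("kink locations:", np.round(np.sort(kinks), 2))

x_far = np.max(kinks) + np.array([100.0, 200.0, 300.0])
y_far = predict(x_far)[:, 0]
slopes = np.diff(y_far) / np.diff(x_far)
print("slopes past the last kink:", np.round(slopes, 6))  # equal -> linear
print("-x^2 keeps curving:       ", -x_far**2)
```

Running it prints identical slopes for all points past the last kink, which is exactly the single-linear-piece behavior described above.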
