Why non-linear activation functions?

I understand that without non-linear activation functions, models would not be able to learn complex relationships between inputs and outputs. However, the explanation in the optional lab confused me.

  1. Why are there different segments in the diagram? Aren’t neurons in the same layer working simultaneously rather than sequentially?
  2. Why is unit 0 fixed?
  3. What do “target” and “match target” mean in the diagram?

I’m sorry if this sounds like a ton of “I don’t understand”…

Hello @Well_Zhang,

In the above, I have two ReLUs. If I add them up (ReLU(x) + ReLU(x-2)), then how many segments will there be in the final curve?

And when I have three ReLUs, how many segments will there be in the final curve that adds all of them up?

Raymond


In the last diagram there should be three segments, right? [0, 2), [2, 4), and [4, ∞)

Thank you for the comment. I think I’ve got the point after trying out the interactive exercise in the lab. But one final question: in a real neural network training process, will each unit be responsible for a specific segment, or do they all contribute to the entire model without a clear division of work?

Almost. There are 4 segments: (-∞, 0), [0, 2), [2, 4), [4, ∞), but I think you have got the idea.
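
If you want to double-check the counting yourself, here is a minimal numpy sketch of my own (not the lab's code). It assumes the three-ReLU curve is ReLU(x) + ReLU(x-2) + ReLU(x-4), with the same breakpoints as above, and counts linear segments by counting the distinct slopes on a fine grid:

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def count_segments(f, lo=-3.0, hi=7.0, n=2001):
    """Count linear segments by counting distinct slopes on a fine grid."""
    x = np.linspace(lo, hi, n)
    y = f(x)
    slopes = np.round(np.diff(y) / np.diff(x), 6)
    return len(np.unique(slopes))

# Two ReLUs -> slopes 0, 1, 2 -> 3 segments
print(count_segments(lambda x: relu(x) + relu(x - 2)))                # 3
# Three ReLUs (assumed breakpoints 0, 2, 4) -> slopes 0, 1, 2, 3 -> 4 segments
print(count_segments(lambda x: relu(x) + relu(x - 2) + relu(x - 4)))  # 4
```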

Yes, the idea is pretty similar, except that we don’t just sum the ReLUs up with equal weight; instead we take a weighted sum of them: w_0\text{ReLU}(z_0) + w_1\text{ReLU}(z_1) + w_2\text{ReLU}(z_2) + .... In my last example, 3 ReLUs give 4 segments, while sometimes (as in the optional lab) 3 ReLUs give only 3 segments. Instead of stating definitively how many ReLUs give how many segments, I prefer to just say that more ReLUs can give you more segments, which means that a very complicated curve will require more ReLUs to get a good fit.
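
To make that last point concrete, here is a small sketch of my own (again, not the lab's code), where only the weights of a fixed set of ReLUs are fitted by least squares. The hinge locations and the target curve (a sine wave) are made up purely for illustration; the point is that more ReLUs give a noticeably better fit:

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

x = np.linspace(0, 2 * np.pi, 200)
target = np.sin(x)  # an arbitrary curvy target, chosen just for this example

def fit_with_relus(n_units):
    # Fixed, evenly spaced hinges; only the weights w_i and a bias b are fitted,
    # i.e. target ≈ b + sum_i w_i * ReLU(x - c_i), solved by least squares.
    hinges = np.linspace(0, 2 * np.pi, n_units, endpoint=False)
    features = np.column_stack([relu(x - c) for c in hinges] + [np.ones_like(x)])
    w, *_ = np.linalg.lstsq(features, target, rcond=None)
    pred = features @ w
    return np.mean((pred - target) ** 2)

for n in (2, 4, 16):
    print(n, "ReLUs -> MSE", fit_with_relus(n))  # error shrinks as n grows
```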

If you ask what a ReLU in layer k contributes to the output of that same layer k, then the idea is exactly what the optional lab shows you. But if you ask what a ReLU in layer k does to the output of a later layer, say k+3, then it becomes difficult to say, or, in your words, there is no clear division of work, even though you can always test it out.

Cheers,
Raymond
