A Long Confusion Solved

Hi folks! I was struggling with the optional "ReLU activation" lab in week 2 of Advanced Learning Algorithms. Here is what I learned from it.

When I first saw the lab, I figured that to produce the 3 different segments we would have to shape 3 different lines into the target curve:

Part-1:
I noticed that in the first part of the graph, y decreases by 2 for each increase of 1 in x, so the slope must be -2. The equation for the first line should be:
y = -2x + 2 so that when
x = 0 → y = 2
x = 1 → y = 0
x > 1 → y < 0, but ReLU maps it to 0
x < 0 → y > 2; the line still works there and is not clipped. We are just not showing it here.

So far we have mapped all values of x > 1 to 0, and all values of x < 1 to the first line. Everything seems good and working.
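The first unit can be sketched in a few lines of numpy. The function names (`relu`, `a11`) are mine, mirroring the lab's a11 notation:

```python
import numpy as np

def relu(z):
    # ReLU clips everything below zero to zero
    return np.maximum(0, z)

def a11(x):
    # unit 1: slope -2, bias 2, then ReLU
    return relu(-2 * x + 2)

print(a11(0.0))  # 2.0  (left end of the segment)
print(a11(1.0))  # 0.0  (right end)
print(a11(2.0))  # 0.0  (the raw line gives -2, clipped by ReLU)
```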

Part-2:
For the next part, y increases by 1 for each increase of 1 in x, so the slope is 1. The equation becomes
y = x - 1 so that when
x = 1 → y = 0
x = 2 → y = 1
x < 1 → y < 0, but ReLU maps it to 0
x > 2 → y = x - 1; it keeps working and is not clipped. (Remember this!)

So far we have also mapped all values of x greater than 2 to x - 1. Everything still seems good and working.
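The second unit follows the same sketch (again, names are mine, not from the lab code). Note how it keeps growing past x = 2:

```python
import numpy as np

def relu(z):
    # ReLU clips everything below zero to zero
    return np.maximum(0, z)

def a12(x):
    # unit 2: slope 1, bias -1, then ReLU
    return relu(x - 1)

print(a12(1.0))  # 0.0
print(a12(2.0))  # 1.0
print(a12(3.0))  # 2.0 -- still contributing past x = 2 (remember this!)
```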

Part-3:
Now the critical part starts. Observing the third segment, I was sure it would follow the previous pattern. For each increase of 1 in x we get an increase of 3 in y, so the slope is 3 and we get:
y = 3x - 5 so that
x = 3 → y = 4
x = 2 → y = 1

But this does not work, as is evident from the graph. Before discussing the problems we face, remember that we are adding a11 + a12 + a13 from all three neurons in the first layer to get the resultant a2 that is plotted in the graph. Try to spot the issues yourself first, then read below:

  1. At x = 2, neurons 2 and 3 are both active and both output 1. So a2 = a11 + a12 + a13 = 0 + 1 + 1 = 2 (we wanted y = 1). The same overlap happened at x = 1, but since both a11 and a12 returned 0 there, it didn't matter.
  2. For x < 2 we wanted neuron 3 to switch off and let neuron 2 contribute alone, but 3x - 5 does not go negative until x drops below 5/3. So just before x = 2 we can see the output rising above the target, deviating from the previous segment, which had been on target for all x < 2.
  3. Another interesting problem (that took me time to notice): why do we get y = 6 when x = 3? Equation 3 gives:
    y = 3x - 5, upon substituting:
    x = 3 → y = 4
    But we got 6. First, think about the reason yourself.
    As I mentioned in part 2, we cannot ignore that neuron 2 keeps contributing a slope of 1 for every unit of x. Starting at x = 1 we always get that unit-slope increase, even after x becomes greater than 2, and that is being reflected in the graph here. At x = 3 we get a2 = a11 + a12 + a13 = 0 + 2 + 4 = 6. We have to accommodate the previous segments' neurons and take care of them too.
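Both failure modes show up if we sum the three units with the "obvious" third line 3x - 5. A minimal sketch (function name `a2_naive` is mine):

```python
import numpy as np

def relu(z):
    # ReLU clips everything below zero to zero
    return np.maximum(0, z)

def a2_naive(x):
    # sum of all three units, with unit 3 using the naive line 3x - 5
    return relu(-2 * x + 2) + relu(x - 1) + relu(3 * x - 5)

print(a2_naive(2.0))  # 2.0 instead of the target 1 (units 2 and 3 overlap)
print(a2_naive(3.0))  # 6.0 instead of the target 4 (unit 2's slope piles on)
```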

So, what we will do now is:
y = eq of line 3 - eq of line 2
y = 3x - 5 - (x - 1)
y = 2x - 4

This can be understood in the same way: a slope of 1 is contributed by neuron 2 and a slope of 2 by neuron 3, which add up to the slope of 3 we need. The biases work in the same manner, combining to give the collective output we want on the graph.
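Putting it together, with unit 3 using 2x - 4 = (3x - 5) - (x - 1), the sum now hits every target point. Again a minimal sketch in my own notation:

```python
import numpy as np

def relu(z):
    # ReLU clips everything below zero to zero
    return np.maximum(0, z)

def a2(x):
    # unit 3 now uses 2x - 4, leaving room for
    # unit 2's ongoing slope-1 contribution past x = 2
    return relu(-2 * x + 2) + relu(x - 1) + relu(2 * x - 4)

for x in [0.0, 1.0, 2.0, 3.0]:
    print(x, a2(x))  # hits the targets 2, 0, 1, 4
```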

Hope you liked it. Thanks for reading.
