Optional Lab: Backpropagation using a computation graph


Why do we need to compute all the derivatives of the nodes from right to left if we are able to compute ∂J/∂w numerically at the beginning of the graph?

J_epsilon = ((w+0.001)*x+b - y)**2/2
k = (J_epsilon - J)/0.001
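For context, the snippet above is the lab's finite-difference estimate of ∂J/∂w: nudge w by a small epsilon, recompute the cost, and divide the change in cost by epsilon. A minimal runnable sketch, using hypothetical values for `w`, `x`, `b`, and `y` (the lab uses its own values):

```python
# Assumed example values, just to make the snippet runnable.
w, x, b, y = 3.0, 2.0, 1.0, 1.0

# Cost J = ((w*x + b) - y)^2 / 2, as in the lab.
J = ((w * x + b - y) ** 2) / 2

# Finite-difference estimate of dJ/dw: perturb w by epsilon.
eps = 0.001
J_epsilon = (((w + eps) * x + b - y) ** 2) / 2
k = (J_epsilon - J) / eps

# Analytic derivative for comparison: dJ/dw = (w*x + b - y) * x
dJ_dw = (w * x + b - y) * x
```

With these values, `k` comes out very close to the analytic derivative, differing only by a term proportional to epsilon.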

Hi @VeronikaS,

It is because we can reuse the results from the layers on the right when computing the layers on the left. It is more apparent if we look at this slide:

See the green arrows in the bottom part of the slide?

If we can reuse something, we save some computation time.


Thank you, @rmwkwok.

One more question about this lab:

do we use the same epsilon for every node of the graph during the back prop?

@VeronikaS, if you are asking about actual backprop when training a model, we don’t really use epsilon because it can introduce rounding errors. We use epsilon in the optional lab just for demonstration purposes and to provide a way to compute derivatives without the need to learn differentiation.
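To illustrate the contrast, here is a sketch (with assumed values for `w`, `x`, `b`, `y`) comparing the exact analytic derivative against the epsilon-based estimate for a few epsilon sizes. The analytic value is exact regardless of epsilon, while the numeric estimate carries an approximation error, and for very small epsilons floating-point rounding can start to dominate:

```python
# Assumed example values.
x, b, y = 2.0, 1.0, 1.0
w = 3.0

def cost(w):
    """Cost J = ((w*x + b) - y)^2 / 2, same form as in the lab."""
    return ((w * x + b - y) ** 2) / 2

# Exact analytic derivative: dJ/dw = (w*x + b - y) * x
analytic = (w * x + b - y) * x

for eps in (1e-3, 1e-6, 1e-9):
    numeric = (cost(w + eps) - cost(w)) / eps
    # The gap |numeric - analytic| reflects approximation error,
    # plus floating-point rounding as eps gets very small.
    print(eps, abs(numeric - analytic))
```

This is why training code relies on analytic derivatives (via the chain rule over the graph), and finite differences are typically reserved for sanity-checking a gradient implementation.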
