Hello!

Why do we need to compute all the derivatives of the nodes from right to left if we are able to compute ∂J/∂w arithmetically at the beginning of the graph?

J_epsilon = ((w+0.001)*x+b - y)**2/2

k = (J_epsilon - J)/0.001

Hi @VeronikaS,

It is because the derivatives we have already computed for the nodes on the right can be reused when we compute the derivatives for the nodes on the left. It is more apparent if we look at this slide:

See the green arrows in the bottom part of the slide?

If we can reuse something, we save some computation time.
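
To make the reuse concrete, here is a minimal sketch of the same cost written as a small graph. The node names `c`, `a`, `d` and the values of `w`, `b`, `x`, `y` are assumptions chosen for illustration, not taken from the slide.

```python
# Assumed example values, chosen only for illustration.
w, b, x, y = 2.0, 8.0, -2.0, 1.0

# Forward pass (left to right), keeping every intermediate node.
c = w * x        # c = w * x
a = c + b        # a = c + b
d = a - y        # d = a - y
J = d ** 2 / 2   # J = d^2 / 2

# Backward pass (right to left), one local derivative per step.
dJ_dd = d              # derivative of d^2 / 2 with respect to d
dJ_da = dJ_dd * 1.0    # d = a - y, so dd/da = 1
dJ_db = dJ_da * 1.0    # a = c + b, so da/db = 1 (reuses dJ_da)
dJ_dc = dJ_da * 1.0    # a = c + b, so da/dc = 1 (reuses dJ_da again)
dJ_dw = dJ_dc * x      # c = w * x, so dc/dw = x

print(dJ_dw, dJ_db)
```

`dJ_da` is computed once and then used for both `dJ_db` and `dJ_dw`. With the epsilon approach, each parameter needs its own extra forward pass through the whole graph, which gets expensive when a model has many parameters.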

Cheers,

Raymond

Thank you, @rmwkwok.

One more question about this lab:

Do we use the same epsilon for every node of the graph during backprop?

@VeronikaS, if you are asking about actual backprop when training a model, we don't really use epsilon because it can introduce rounding errors. We use epsilon in the optional lab just for demonstration purposes and to provide a way to compute derivatives without the need to learn differentiation.
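
As a rough illustration of why the epsilon estimate is only approximate, the sketch below compares it with the exact derivative of the cost from the question. The helper names `cost` and `finite_difference` and the values of `w`, `b`, `x`, `y` are assumptions made up for this example.

```python
def cost(w, b, x, y):
    """Squared-error cost from the question: ((w*x + b) - y)**2 / 2."""
    return ((w * x + b) - y) ** 2 / 2


# Assumed example values, chosen only for illustration.
w, b, x, y = 2.0, 8.0, -2.0, 1.0

# Exact derivative from calculus: dJ/dw = ((w*x + b) - y) * x
exact = ((w * x + b) - y) * x


def finite_difference(eps):
    """Estimate dJ/dw by nudging w by eps, as in the optional lab."""
    return (cost(w + eps, b, x, y) - cost(w, b, x, y)) / eps


for eps in (1e-1, 1e-3, 1e-6, 1e-12):
    approx = finite_difference(eps)
    error = abs(approx - exact)
    print(f"eps={eps:g}  approx={approx:.10f}  error={error:.2e}")
```

A large epsilon gives a visibly biased estimate, while a very small one makes the subtraction of two nearly equal costs sensitive to floating-point rounding. The derivative from differentiation has neither problem.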

Cheers,

Raymond
