Is it possible to get more detail on why the computational complexity with a computation graph goes from NxP to N+P? I can't easily grasp it from the lectures.

I happen to remember this reply of mine so I am sharing it here. Next time, please do as @Deepti_Prasad asks, because we need your help to see what you saw.

As Andrew mentioned in the course, if we compute derivatives left to right (forward prop), we have to go through the graph once for each parameter, one parameter at a time. With N nodes and P parameters, that is roughly NxP steps.

Backprop, by contrast, lets us compute the derivatives of all parameters in a single right-to-left pass, so the cost is roughly N+P steps.
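To make the counting concrete, here is a minimal sketch on a toy graph. It is not the lecture's code: the chain-shaped graph, the function names, and the way steps are counted are all assumptions made for illustration. The graph has P multiply nodes (one per weight) followed by a subtract node and a squared-error node, so N = P + 2. The "one parameter at a time" version is stood in for by a finite-difference bump that re-runs the whole graph per parameter; the backprop version walks the graph once from right to left.

```python
def forward(ws, x, y):
    """Evaluate the toy graph left to right.

    Returns the cost J, the stored activations, and the node count N.
    """
    acts = [x]
    for w in ws:
        acts.append(w * acts[-1])    # one multiply node per parameter
    d = acts[-1] - y                 # subtract node
    J = 0.5 * d * d                  # squared-error node
    n_nodes = len(ws) + 2
    return J, acts, n_nodes


def grads_one_at_a_time(ws, x, y, eps=1e-6):
    """Bump each parameter and re-run the whole graph: about N*P steps."""
    base, _, _ = forward(ws, x, y)   # baseline run, not counted
    grads, steps = [], 0
    for i in range(len(ws)):
        bumped = list(ws)
        bumped[i] += eps
        J, _, n_nodes = forward(bumped, x, y)
        steps += n_nodes             # every node is re-evaluated
        grads.append((J - base) / eps)
    return grads, steps


def grads_backprop(ws, x, y):
    """One right-to-left pass: about N+P steps (after one forward pass)."""
    _, acts, _ = forward(ws, x, y)
    grad_a = acts[-1] - y            # dJ/da_P through the last two nodes
    steps = 2
    grads = [0.0] * len(ws)
    for k in reversed(range(len(ws))):
        grads[k] = grad_a * acts[k]  # one step per parameter derivative
        grad_a = grad_a * ws[k]      # one step per multiply node
        steps += 2
    return grads, steps


ws, x, y = [0.5, -1.5, 2.0, 1.2], 1.0, 0.3   # P = 4, so N = 6
fd_grads, fd_steps = grads_one_at_a_time(ws, x, y)
bp_grads, bp_steps = grads_backprop(ws, x, y)
print(fd_steps, bp_steps)
```

With these 4 parameters and 6 nodes, the one-at-a-time count is 6x4 = 24 while the backprop count is 6+4 = 10. The forward pass that backprop needs adds another N, which does not change the picture: one approach grows with the product of N and P, the other with their sum.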

For completeness, the question refers to this lecture:

Advanced Learning Algorithms → Week 2 → Computation graph (Optional)

timestamp ~ 17:50

