I can't understand why the number of steps is N + P. Can anyone please list all the nodes and show the calculation for each one, along with its step number,
so that it is clear the total is in fact N + P?

Vedant Sharma

Hi @Vedbhai

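In backward propagation, each node’s gradient (the derivative of the cost with respect to that node’s value) is computed exactly once, which gives N steps, and each parameter’s gradient is computed exactly once, which gives P steps. The total is therefore N + P steps, rather than the roughly N × P you would need if you recomputed the whole chain separately for every parameter.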
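To make the counting concrete, here is a rough sketch in Python. The tiny graph (c = w*x, a = c + b, d = a - y, J = d²/2), the input values, and the convention of counting dJ/dJ = 1 as the first node step are all my own assumptions for illustration, not something taken from your question. It has N = 4 nodes (c, a, d, J) and P = 2 parameters (w, b), so it should record 4 + 2 = 6 steps.

```python
# Hypothetical tiny computation graph (an assumption for illustration):
#   c = w * x,  a = c + b,  d = a - y,  J = d**2 / 2
# Nodes: c, a, d, J  (N = 4).  Parameters: w, b  (P = 2).

def backprop_with_step_count(w, b, x, y):
    # Forward pass, left to right: compute the value of every node.
    c = w * x
    a = c + b
    d = a - y
    J = d ** 2 / 2

    steps = []  # each entry: (step number, quantity, value)

    def record(name, value):
        # Log one backprop step and pass the value through.
        steps.append((len(steps) + 1, name, value))
        return value

    # Backward pass, right to left: one step per node (N = 4 steps).
    dJ_dJ = record("dJ/dJ", 1.0)           # output node, trivially 1
    dJ_dd = record("dJ/dd", dJ_dJ * d)     # J = d**2 / 2  ->  dJ/dd = d
    dJ_da = record("dJ/da", dJ_dd * 1.0)   # d = a - y     ->  dd/da = 1
    dJ_dc = record("dJ/dc", dJ_da * 1.0)   # a = c + b     ->  da/dc = 1

    # One step per parameter, reusing the node gradients (P = 2 steps).
    record("dJ/dw", dJ_dc * x)             # c = w * x     ->  dc/dw = x
    record("dJ/db", dJ_da * 1.0)           # a = c + b     ->  da/db = 1

    return J, steps


J, steps = backprop_with_step_count(w=2.0, b=8.0, x=-2.0, y=2.0)
for number, name, value in steps:
    print(f"step {number}: {name} = {value}")
print(f"total steps = {len(steps)}")
```

The key point is that dJ/dw and dJ/db each take a single extra multiplication because they reuse dJ/dc and dJ/da, which were already computed once while walking back through the nodes.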

If you need more help, feel free to ask, and I can explain it with an example.

