After watching the video “computation graph” from week 2, I wonder: why do we use backward propagation when we have analytical formulas to compute the derivatives?
Is it because, for some specific activation functions, we don’t have such formulas?
Backpropagation is a crucial step in training neural networks. It calculates the derivatives of the cost function with respect to each layer’s parameters and propagates them from the last layer back to the first.
Each layer computes its local derivatives and, via the chain rule, passes the resulting gradient back to the preceding layer.
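To make that concrete, here is a minimal NumPy sketch (hypothetical sizes and variable names, assuming sigmoid activations and a binary cross-entropy cost as in the course videos) of a 2-layer network, where each layer’s gradients are built from the gradient handed back by the layer after it:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

np.random.seed(0)
m = 5                                # number of examples
X = np.random.randn(3, m)            # inputs: 3 features per example
Y = np.random.randint(0, 2, (1, m))  # labels, used only at the output layer

W1, b1 = np.random.randn(4, 3) * 0.01, np.zeros((4, 1))
W2, b2 = np.random.randn(1, 4) * 0.01, np.zeros((1, 1))

# Forward pass (intermediate values cached for the backward pass)
Z1 = W1 @ X + b1
A1 = sigmoid(Z1)
Z2 = W2 @ A1 + b2
A2 = sigmoid(Z2)

# Backward pass: start at the output layer and move toward the input
dZ2 = A2 - Y                          # output-layer error (uses the labels Y)
dW2 = (dZ2 @ A1.T) / m
db2 = dZ2.sum(axis=1, keepdims=True) / m

dA1 = W2.T @ dZ2                      # gradient passed back to layer 1
dZ1 = dA1 * A1 * (1 - A1)             # chain rule through the sigmoid
dW1 = (dZ1 @ X.T) / m
db1 = dZ1.sum(axis=1, keepdims=True) / m
```

Note that the analytical formula for each individual piece is still used; backpropagation just organizes the chain rule so that values cached during the forward pass (like A1 and A2 here) are reused instead of recomputed.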
@Etienne_Cuisinier, those equations are incomplete for a full neural network. The reason is that in a hidden layer we don’t have the y(i) values; the labels y(i) only apply to the output layer.
Backpropagation is how we compute the gradients for the hidden layers.
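A sketch of the two-layer case (using the bracketed layer superscripts I believe the course uses, with * denoting the element-wise product) makes the contrast visible:

$$dZ^{[2]} = A^{[2]} - Y$$
$$dZ^{[1]} = W^{[2]T} dZ^{[2]} * g^{[1]\prime}\left(Z^{[1]}\right)$$

The labels Y appear only in the output-layer equation; the hidden-layer gradient is assembled entirely from the gradient that the following layer passes back.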