Backpropagation

Consider a neural network with 2 or more layers. After we update the weights in layer 1, the input to layer 2 (a(1)) has changed, so ∂z/∂w is no longer correct: z has changed to z* and z* ≠ aw + b. It seems we can only take the partial derivative of the loss with respect to w for the first layer, because when we take the partial derivative of the loss with respect to the weights of subsequent layers we must hold all other variables constant, and an update to an earlier layer (or layers) prevents this.

Use the chain rule.

I fully understand the mechanics of the chain rule. I don’t think you follow the argument in my post.

You are correct.

Hello @Y_L1,

In principle, we can calculate the partial derivative with respect to each and every weight before any of them is updated, so the problem you describe does not arise.

In practice, as the name “back propagation” suggests, we compute the derivatives from the last layer back to the first, and we update the weights from the last layer backwards. Using your example, that means updating layer 2 before layer 1. Moreover, the calculation of every derivative is based on results cached during the forward pass, which means we only ever use “z” and never any “z*”. The key is that, during the forward pass, we need to cache all the results required to calculate the derivatives in the backward pass, and Andrew has shown that in the lectures.
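Here is a minimal NumPy sketch of that idea (not the course's exact code; the layer sizes, activations, and learning rate are toy choices of my own). All gradients are computed from the cached forward-pass values, and the weights are only updated afterwards, so no derivative ever sees a “z*”:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 4 examples with 3 features, scalar regression target
X = rng.normal(size=(3, 4))          # shape (n_features, m)
Y = rng.normal(size=(1, 4))          # shape (1, m)

# Parameters of a hypothetical 2-layer network
W1, b1 = rng.normal(size=(5, 3)), np.zeros((5, 1))
W2, b2 = rng.normal(size=(1, 5)), np.zeros((1, 1))

# ---- Forward pass: cache z1, a1, a2 for the backward pass ----
z1 = W1 @ X + b1
a1 = np.maximum(0, z1)               # ReLU
z2 = W2 @ a1 + b2
a2 = z2                              # linear output
loss = np.mean((a2 - Y) ** 2)

# ---- Backward pass: every derivative uses the cached values ----
m = X.shape[1]
dz2 = 2 * (a2 - Y) / m               # dL/dz2
dW2 = dz2 @ a1.T                     # uses cached a1, not an updated one
db2 = dz2.sum(axis=1, keepdims=True)
da1 = W2.T @ dz2                     # uses the old W2 (chain rule)
dz1 = da1 * (z1 > 0)                 # uses cached z1
dW1 = dz1 @ X.T
db1 = dz1.sum(axis=1, keepdims=True)

# ---- Only now are any weights updated ----
lr = 0.01
W2 -= lr * dW2; b2 -= lr * db2
W1 -= lr * dW1; b1 -= lr * db1
```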

Cheers,
Raymond
