While going through the lecture video, I tried to follow the math used in the Forward and Back Propagation calculations.
- I noticed that Prof Andrew mentioned caching z, w, and b from each layer, but did not say anything about the variable a. Yet the math explicitly uses a, not z, in the formulae.
- I was also trying to make sense of his statement that, in Back Prop, the first building block from the end takes da[l] as input and outputs da[l-1], but I cannot find any support for this statement in the math.
Please let me know if I am missing something. Attached are a few screenshots for your reference.
You’ll notice that the formulas do include both A and Z, but it is the A from the previous layer. When you get to the assignment and see how the caches are actually constructed, you will see that both values are cached.
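For reference, here is a minimal NumPy sketch of one forward building block, along the lines of what the assignment constructs (the function and variable names here are my own, not necessarily the assignment's). You can see that the cache holds A^{[l-1]} alongside Z^{[l]}, W^{[l]}, and b^{[l]}:

```python
import numpy as np

def linear_activation_forward(A_prev, W, b, activation):
    """One forward step for layer l. A_prev is A^{[l-1]}; the cache
    stores everything backprop will need: (A_prev, W, b) and Z."""
    Z = W @ A_prev + b                  # Z^{[l]} = W^{[l]} A^{[l-1]} + b^{[l]}
    if activation == "relu":
        A = np.maximum(0, Z)
    else:                               # sigmoid
        A = 1 / (1 + np.exp(-Z))
    cache = ((A_prev, W, b), Z)         # both A^{[l-1]} and Z^{[l]} are cached
    return A, cache
```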
For question 2), I think it’s probably just a misinterpretation of what he says. Notice that dA shows up in the pictures, but nowhere in the formulas. At the output layer you start with A^{[L]}, and that gives you dZ^{[L]}, which is then used to compute dZ^{[L-1]} and hence dW^{[L-1]}. The whole process is just one huge serial application of the Chain Rule. Since all the derivatives are w.r.t. J, the output of the very last function in the chain, everything at a given layer depends on all the later layers. Remember that in Prof Ng’s simplified notation:
dW^{[l]} = \displaystyle \frac {\partial J}{\partial W^{[l]}}
All the gradients are partial derivatives of J w.r.t. the parameter in question.
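To make the "takes dA^{[l]} as input, outputs dA^{[l-1]}" statement concrete, here is a minimal sketch of one backward building block (again, the names are mine, not necessarily the assignment's). It takes dA^{[l]} and the forward cache from above, and the last line is exactly the step that produces dA^{[l-1]} for the next block back:

```python
import numpy as np

def linear_activation_backward(dA, cache, activation):
    """One backward step for layer l: takes dA^{[l]} plus the forward
    cache and returns (dA^{[l-1]}, dW^{[l]}, db^{[l]})."""
    (A_prev, W, b), Z = cache
    m = A_prev.shape[1]
    if activation == "relu":
        g_prime = (Z > 0).astype(Z.dtype)   # derivative of ReLU
    else:                                   # sigmoid
        s = 1 / (1 + np.exp(-Z))
        g_prime = s * (1 - s)
    dZ = dA * g_prime                       # dZ^{[l]} = dA^{[l]} * g'(Z^{[l]})
    dW = (dZ @ A_prev.T) / m                # dW^{[l]} = (1/m) dZ^{[l]} A^{[l-1]T}
    db = np.sum(dZ, axis=1, keepdims=True) / m
    dA_prev = W.T @ dZ                      # dA^{[l-1]} = W^{[l]T} dZ^{[l]}
    return dA_prev, dW, db
```

So the statement in the lecture is just describing the interface of this block: the dA values are intermediate results of the Chain Rule that get passed from layer to layer, even though the formulas you quoted are usually written in terms of dZ.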
Thank you for the detailed explanation.