I think your name is Luca. Run all the cells above.
I found that there is a spelling mistake. You are using “cashes” but it is “caches”.
Thanks, Saif… this notebook is really confusing me.
Another question, if I may, about the reversed for loop. Going by the lectures, forward propagation goes through RELU with A[L-1] and then into SIGMOID, which gives A[L]. I pass that to back-propagation and get out dA[L-1] plus dWL and dbL. I do not know if I understood correctly; this module is really confusing me.
In Exercise 9 - L_model_backward I see that we follow the theory and get out dA[L-1] plus dWL and dbL. But after that we go through the ReLU backward activation inside the reversed for loop over range(L-1), whose comment says "Loop from l=L-2 to l=0" (I do not get the L-2; why?). We go in with dA[l+1] (why?) and get out dA[l], along with dW[l+1] and db[l+1].
Hope that is clear… many thanks.
You described the overall structure correctly: in forward propagation, we first loop through the hidden layers and then do the output layer as a separate step after the “hidden” layer loop.
In back prop, it’s exactly the opposite: we first start by handling the output layer outside the loop. Then we loop backwards through the hidden layers until we get to the first layer.
You just have to carefully work out how the index values work. You need to be clear about the fact that indexing is “0 based” in Python. Try running these loops and watch what happens:
for ii in range(5):
    print(f"ii = {ii}")
print(f"After loop ii = {ii}")

for ii in reversed(range(4)):
    print(f"ii = {ii}")
print(f"After loop ii = {ii}")
Put print statements in your loops in L_model_forward and L_model_backward to watch the index values.
Thanks. As you suggested, I printed l in the for loop over range(1, L) in L_model_forward and got 1 and 2, which sounds right since L is 3, right?
But when I tried the loop for l in reversed(range(L-1)) in L_model_backward, I got only 0.
Did I miss something?
What I do not get is the comment line for the loop:
Loop from l=L-2 ← (why) to l=0
But in that test case L = 2, right? Which means there is one hidden layer and the output layer.
But notice that the logic doesn’t use l directly: it uses l + 1 for the gradients of W and b. We end up generating a gradient for A0, but we just throw that away.
So for the output layer, we compute dW2, db2 and dA1. Then for the loop on the hidden layers, we run it only once and produce dW1, db1 and dA0, and then we’re done.
But the loop should be perfectly general: it will also work in cases where L > 2, right? Just follow through what would happen in a bigger case.
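To follow through a bigger case without touching the assignment code, here is a minimal bookkeeping sketch. It only records which gradient names would be produced at each step, assuming the structure described above (output layer first, then the reversed loop). The function name trace_backward is made up for illustration and is not part of the notebook.

```python
def trace_backward(L):
    """Return the gradient names produced, in order, for an L-layer network."""
    steps = []
    # Output layer: handled once, before the loop.
    steps.append((f"dW{L}", f"db{L}", f"dA{L - 1}"))
    # Hidden layers: l runs from L-2 down to 0, and each pass
    # produces dW and db for layer l+1 and dA for layer l.
    for l in reversed(range(L - 1)):
        steps.append((f"dW{l + 1}", f"db{l + 1}", f"dA{l}"))
    return steps


for L in (2, 4):
    print(f"L = {L}")
    for step in trace_backward(L):
        print("  ", step)
```

With L = 2 the loop body runs once (l = 0); with L = 4 it runs for l = 2, 1, 0.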
If L=2 and we start from 0 (as the print statement showed),
do you mean that we start at 0 to calculate dW2, db2 and dA1,
and then move to 1 (that is, the l+1 we assumed) to calculate dW1, db1 and dA0?
But I do not think it is as I said, because I still do not understand the comment "Loop from l=L-2 ← (why) to l=0".
I do not know exactly what I am saying. I think I need medical care.
Anyway, many thanks for the effort.
No, remember what we described in the previous post: for Backward Propagation, everything happens in the reverse order. Here’s what I said:
So we compute dW2, db2 and dA1 before the loop. I said it again in my other reply above:
You are right. I told you that I need medical care… my brain is gone, if I ever had one.
In "from l=L-2 to l=0", what is the for loop doing here? What is L-2?
Well we already talked about the fact that L = 2. So what is 2 - 2? Sounds like 0, right?
So that is why I got 0 when I printed l.
And one last thing:
As you said, l is not used directly; l+1 is used instead for W and b… do you mean it is a mere convention?
Many thanks, I appreciate it.
Yes, they could have done it differently. But the point is that at a given layer, we generate the dW and db for one layer, but the dA for the layer before. So you need two different index values. You could call them l and l + 1 or you could call them l - 1 and l. It’s your choice. Of course that has to match how you define the “range” on the loop. They chose to do it the first way.
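Here is a small sketch of that point: the two naming choices visit exactly the same pairs of layers, as long as the range matches. Each pair is (layer whose dW and db we compute, layer whose dA we compute). L = 4 is just an example value, and the variable names are illustrative only.

```python
L = 4  # example value, not taken from the assignment

# First way (the one used in the notebook comment):
# l runs from L-2 down to 0; dW, db are for layer l+1 and dA is for layer l.
first_way = [(l + 1, l) for l in reversed(range(L - 1))]

# Second way: l runs from L-1 down to 1;
# dW, db are for layer l and dA is for layer l-1.
second_way = [(l, l - 1) for l in reversed(range(1, L))]

print(first_way)   # [(3, 2), (2, 1), (1, 0)]
print(second_way)  # [(3, 2), (2, 1), (1, 0)]
```

Changing the names without changing the range is what breaks things.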
If you start from L-2, it would make more sense to call it l-1.
But thanks for the effort.
Well, if L = 2 and you start with l = L - 2, then what is l - 1? It’s -1, right? What does that mean as an index value in Python? Hint: it means the last entry. How would that work?
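A minimal sketch of the hint, assuming the caches are kept in a plain Python list with one entry per layer (the strings below are placeholders, not real cache contents):

```python
# One placeholder entry per layer: index 0 is layer 1, index 1 is layer 2.
caches = ["cache for layer 1", "cache for layer 2"]
L = len(caches)  # L = 2

l = L - 2               # the loop starts at l = 0
picked = caches[l - 1]  # l - 1 is -1, which Python treats as "last entry"

print(l - 1, picked)
```

So instead of raising an error, the index quietly selects the output layer's cache, which is not the layer you wanted.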
White flag, I surrender.