Hello,
This might be a bit of a silly question, but I don’t understand why the for loop in L_model_forward is:
for l in range(1, L):
and not
for l in range(1, L-1):
Thanks for the help
Hi @SuitaX, welcome to this channel.
Regarding your question, you can use the comments in “initialize_parameters_deep()” as a reference. They explain what you get when the loop uses (1, L) and there is only one layer. On the other hand, with (1, L-1) and one layer the range becomes (1, 0), so the for loop will never be executed.
Hello Carlosrl,
Thank you for your help!
So I quote the first “initialize_parameters_deep” instruction:
“The model’s structure is [LINEAR → RELU] × (L-1) → LINEAR → SIGMOID. I.e., it has 𝐿−1 layers using a ReLU activation function followed by an output layer with a sigmoid activation function.”
From what I understand, if L=1 there is no ReLU layer, just the final sigmoid layer, so I would find it normal for the for loop not to run.
I know I am wrong: I tried “for l in range(1, L-1):” and it doesn’t work… I will give it some more thought.
Hello @SuitaX !
I think the answer is related to how Python loops work: range(start, stop) begins at start and ends just before stop, so the stop value itself is never visited.
Here’s an example:
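A minimal sketch of such a loop, assuming the example iterated over range(1, 6) (the bounds and the name `visited` are assumptions chosen to match the description that follows):

```python
# Assumed example: range(1, 6) starts at 1 and stops before 6.
visited = []
for i in range(1, 6):
    visited.append(i)  # record each value the loop actually reaches
    print(i)
```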
You can see that we never get to 6 in the above example. The loop terminates at 5.
So, in range(1, L), we are not calculating the activation for the final layer (L). The last activation that the loop calculates is A_(L-1), the activation of the second-to-last layer. The next part of the code, outside the loop, calculates A_L, the activation of the last layer. Thus sigmoid is used only to compute A_L, while all the earlier layers use ReLU as their activation function.
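To make the difference between the two loop bounds concrete, here is a rough sketch that only records which activation each layer would receive. The helper `activation_plan` and its `hidden_stop` parameter are hypothetical, made up for illustration; this is not the assignment's actual L_model_forward code.

```python
def activation_plan(L, hidden_stop):
    """Hypothetical helper: map each layer index to the activation it would get."""
    # Hidden layers are handled inside the loop, like the ReLU layers.
    plan = {l: "relu" for l in range(1, hidden_stop)}
    # The output layer L is handled separately, outside the loop.
    plan[L] = "sigmoid"
    return plan

L = 4
correct = activation_plan(L, L)        # loop written as range(1, L)
too_short = activation_plan(L, L - 1)  # loop written as range(1, L-1)
print(correct)
print(too_short)
```

With range(1, L), layers 1 to L-1 get ReLU and layer L gets sigmoid. With range(1, L-1), layer L-1 is never computed at all, which would explain why that version does not work.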
Hello @shwetank,
That’s it… I was thinking of the loop as running from 1 to n.
Thank you, I really appreciate it!
Yes, indexing in Python is all “0-based”, which applies to both array indices and loop indices. Here’s the analog of the experiment that @shwetank showed us above for loops, but with array indices:
>>> import numpy as np
>>> v = np.array([1,2,3,4])
>>> len(v)
4
>>> v[1]
2
>>> v[0]
1
>>> v[3]
4
>>> v[4]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: index 4 is out of bounds for axis 0 with size 4
>>>
Thank you, this is helpful. I understand the behavior of the loop better now.