# Not able to understand how dw1, dw2 loop got eliminated

I am not able to understand how we eliminated the second loop that ran over dw1 and dw2 and consolidated them into just dw. Can someone please help?

Hey @Harshita_Gupta1,
Welcome to the community. Can you please let us know which week and which assignment you are referring to?

Cheers,
Elemento

Hello, thank you for replying. This is Course 1, Week 2 - Vectorizing Logistic Regression.
I am unable to understand how we eliminated the loop over dw1 and dw2 and replaced it with a single vector dw.

Hey @Harshita_Gupta1,
I am assuming here that you are referring to the implementation of the `optimize` function. I guess you can easily find the answer to your query by understanding the structure of `dw`. Let's say that you have 10 input features, and hence 10 weights. In that case, `dw1` will contain the gradient of the loss function with respect to `w1`. Similarly, `dw2` will contain the gradient of the loss function with respect to `w2`, and so on. In this case, we would write the code for updating the parameters as follows:

```python
w1 = w1 - learning_rate * dw1
w2 = w2 - learning_rate * dw2
```

and so on. Now, you can easily create a vector `dw` to store all the `dw(i)`, i.e., `dw = [dw1, dw2, ....., dw10]`, and similarly, a vector `w` to store all the `w(i)`, i.e., `w = [w1, w2, ....., w10]`. Once you have created the vectors `dw` and `w`, all you need is the same update rule, but applied to the vectors instead of the scalar quantities, and it will update all 10 weights at once. The code will be as follows:

```python
w = w - learning_rate * dw
```
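To see the equivalence concretely, here is a minimal NumPy sketch (with hypothetical random values for `w` and `dw`, and an assumed `learning_rate` of 0.01 — these specifics are for illustration only, not from the assignment):

```python
import numpy as np

# Hypothetical example: 10 weights and their gradients
rng = np.random.default_rng(0)
w = rng.standard_normal(10)   # [w1, w2, ..., w10]
dw = rng.standard_normal(10)  # [dw1, dw2, ..., dw10]
learning_rate = 0.01

# Loop version: update each weight separately (w1 -= lr*dw1, w2 -= lr*dw2, ...)
w_loop = w.copy()
for i in range(10):
    w_loop[i] = w_loop[i] - learning_rate * dw[i]

# Vectorized version: one element-wise NumPy operation updates all 10 weights
w_vec = w - learning_rate * dw

# Both versions produce identical updated weights
assert np.allclose(w_loop, w_vec)
```

The element-wise subtraction and scalar multiplication on NumPy arrays do exactly what the per-weight loop does, which is why the loop can be dropped.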

Let me know if this helps.

Cheers,
Elemento

This was very helpful. Thank you so much