Exercise 6 - backward_propagation in Programming Assignment Week 3

In this markdown, Figure 1 shows

dZ2 = A2 - Y
dW2 = (1/m) dZ2 A1.T

I think it should be dZ2 = (1/m) (A2 - Y), dW2 = dZ2 A1.T. Is that correct? I’m sorry if I made a mistake in my calculation.


I'm not sure what your query is. Could you please provide more details?

My question is exactly the same as the one discussed in this thread.

In the DLS Course 1 Week 3 and Week 4 video lessons and programming assignments, the backpropagation equations are given as follows.
dZ[L] = A[L] - Y
dW[i] = (1/m) dZ[i] A[i-1].T
db[i] = (1/m) np.sum(dZ[i], axis=1, keepdims=True) (1<=i<=L)

Specifically, they appear in the following situations:
・All scenes dealing with vectorized backpropagation in the lecture videos of Weeks 3 and 4
They first appear on the blackboard around 7:30 in the video at https://www.coursera.org/learn/neural-networks-deep-learning/lecture/Wh8NI/gradient-descent-for-neural-networks

・The clarification-for-what-does-this-have-to-do-with-the-brain document, which corrects a typo in the video of that section

・Exercise-6 backward_propagation cell in Week3 programming assignment (when L=2)

・Some people say that the same formulas will appear in the Week 4 assignments (I haven’t done the Week 4 assignments yet, so I don’t know)

But I suspect that these backpropagation equations contain a typo, and that they should read:
dZ[L] = (1/m) (A[L] - Y)
dW[i] = dZ[i] A[i-1].T
db[i] = np.sum(dZ[i], axis=1, keepdims=True) (1<=i<=L).

It seems that typographical errors have occurred in all the scenes I mentioned above, so please confirm.
In particular, some of the people who did the Week 4 assignments claim that points will be deducted for writing logically correct formulas, so I would appreciate it if you could confirm it as soon as possible.

I have now completed the Week 4 programming assignments, and the problem I mentioned earlier does seem to occur in the Exercise 7 - linear_backward cell of Building_your_Deep_Neural_Network_Step_by_Step.

In the cell starting with def linear_backward(dZ, cache):, the following code

{moderator edit - solution code removed}

is treated as an incorrect answer, and an assert statement raises an error.


Are you implying that the equations shown in the figure are incorrect? They are not.

Hi, @Mubsi,
What do you mean by “the equations shown in the figure”?

As Mubsi says, the equations as written are correct. I answered on the other thread where you also asked this question. Here’s an older thread that also discusses this point.
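One way to convince yourself that the equations as written are correct is a numerical gradient check. The sketch below is only an illustration, not code from the assignment: it assumes a sigmoid output layer with the averaged cross-entropy cost, treats A1 as a fixed input, and uses made-up shapes and helper names (`sigmoid`, `cost`). It compares dW2 and db2 from the course formulas against central finite differences of J.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(W2, b2, A1, Y):
    """Cross-entropy cost J, averaged over the m examples (columns)."""
    m = Y.shape[1]
    A2 = sigmoid(W2 @ A1 + b2)
    return float(-np.sum(Y * np.log(A2) + (1 - Y) * np.log(1 - A2)) / m)

# Hypothetical small problem: 4 hidden units, 5 examples.
rng = np.random.default_rng(0)
n1, m = 4, 5
A1 = rng.standard_normal((n1, m))
Y = rng.integers(0, 2, size=(1, m)).astype(float)
W2 = rng.standard_normal((1, n1))
b2 = np.zeros((1, 1))

# Gradients from the course formulas.
A2 = sigmoid(W2 @ A1 + b2)
dZ2 = A2 - Y
dW2 = (1 / m) * dZ2 @ A1.T
db2 = (1 / m) * np.sum(dZ2, axis=1, keepdims=True)

# Central finite differences of J with respect to each parameter.
eps = 1e-6
dW2_num = np.zeros_like(W2)
for j in range(n1):
    step = np.zeros_like(W2)
    step[0, j] = eps
    dW2_num[0, j] = (cost(W2 + step, b2, A1, Y)
                     - cost(W2 - step, b2, A1, Y)) / (2 * eps)
db2_num = (cost(W2, b2 + eps, A1, Y) - cost(W2, b2 - eps, A1, Y)) / (2 * eps)

max_err_W = float(np.max(np.abs(dW2 - dW2_num)))
max_err_b = float(abs(db2[0, 0] - db2_num))
print(max_err_W, max_err_b)
```

Both printed differences should be tiny (limited by floating-point error), which is what you would expect if dW2 and db2 are the derivatives of J.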

For the specific chunk of code you show, you are not supplying the factor of \frac {1}{m}, so presumably you did it “your way” and included that in the dAL calculation at the outer layer. You may be able to get that to work if you make the adjustments throughout, but I think the better strategy is just to follow the way Prof Ng does it. That will just be simpler than trying to rewrite everything your way.
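To illustrate the point about making the adjustments throughout, here is a small sketch, not the assignment code, that runs both conventions on a 2-layer network. It assumes a tanh hidden layer and a sigmoid output; the function name `backward` and the flag `fold_m_into_dZ` are hypothetical. As long as the factor 1/m is applied exactly once, either in dZ2 or in each dW and db, the final gradients should agree.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def backward(X, Y, A1, A2, W2, fold_m_into_dZ):
    """Backprop for a 2-layer net (tanh hidden layer, sigmoid output).

    fold_m_into_dZ=False: course convention, 1/m applied in dW and db.
    fold_m_into_dZ=True:  alternative, 1/m applied once in dZ2.
    """
    m = X.shape[1]
    if fold_m_into_dZ:
        scale_dZ, scale_grad = 1.0 / m, 1.0
    else:
        scale_dZ, scale_grad = 1.0, 1.0 / m
    dZ2 = scale_dZ * (A2 - Y)
    dW2 = scale_grad * dZ2 @ A1.T
    db2 = scale_grad * np.sum(dZ2, axis=1, keepdims=True)
    dZ1 = (W2.T @ dZ2) * (1 - A1 ** 2)  # derivative of tanh is 1 - A1^2
    dW1 = scale_grad * dZ1 @ X.T
    db1 = scale_grad * np.sum(dZ1, axis=1, keepdims=True)
    return dW1, db1, dW2, db2

# Hypothetical shapes: 3 inputs, 4 hidden units, 6 examples.
rng = np.random.default_rng(1)
X = rng.standard_normal((3, 6))
Y = rng.integers(0, 2, size=(1, 6)).astype(float)
W1, b1 = rng.standard_normal((4, 3)), np.zeros((4, 1))
W2, b2 = rng.standard_normal((1, 4)), np.zeros((1, 1))

# Forward pass.
A1 = np.tanh(W1 @ X + b1)
A2 = sigmoid(W2 @ A1 + b2)

grads_course = backward(X, Y, A1, A2, W2, fold_m_into_dZ=False)
grads_alt = backward(X, Y, A1, A2, W2, fold_m_into_dZ=True)
print(all(np.allclose(a, b) for a, b in zip(grads_course, grads_alt)))
```

The two conventions differ only in their intermediate dZ values. A function that receives a course-style dZ but omits the 1/m would return gradients m times too large, which is presumably the kind of mismatch a grader's assert would catch.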

Note that a generalization of the formulas shown will also be used in Week 4 where we finally get to full generality and can have any number of hidden layers. Prof Ng is consistent in doing it the same way there: he does everything with derivatives of L until the final step at each layer of computing the dW^{[l]} and db^{[l]} as derivatives of J.
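A short derivation may make the placement of the 1/m factor clearer. It is a sketch that assumes the usual definitions from the course: J is the average of the per-example losses, and column i of dZ^{[l]} holds the derivative of the loss for example i with respect to z^{[l](i)}.

```latex
\begin{align*}
J &= \frac{1}{m} \sum_{i=1}^{m} \mathcal{L}\big(a^{[L](i)}, y^{(i)}\big) \\
dz^{[l](i)} &:= \frac{\partial \mathcal{L}^{(i)}}{\partial z^{[l](i)}},
  \qquad dz^{[L](i)} = a^{[L](i)} - y^{(i)}
  \quad \text{(sigmoid output with cross-entropy loss)} \\
\frac{\partial J}{\partial W^{[l]}}
  &= \frac{1}{m} \sum_{i=1}^{m} dz^{[l](i)} \, a^{[l-1](i)\,T}
   = \frac{1}{m} \, dZ^{[l]} A^{[l-1]\,T} \\
\frac{\partial J}{\partial b^{[l]}}
  &= \frac{1}{m} \sum_{i=1}^{m} dz^{[l](i)}
\end{align*}
```

The 1/m comes from averaging over the examples in J, so it shows up only where the per-example quantities are summed, that is, in dW^{[l]} and db^{[l]}.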

Thanks to your explanation, I now understand what the formulas mean. Thank you!

