Assignment E9: L_model_backward

I completed this, but wanted to make sure I understand it right! Please correct me if I am mistaken somewhere.

Step 1: We calculate dAL, the derivative of the cost with respect to the final activation AL, using the new formula based on np.divide.
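
Just to spell out what I mean, this is roughly how that initialization looks for the cross-entropy cost (a minimal runnable sketch; the AL and Y values are made-up placeholders):

```python
import numpy as np

# Hypothetical example values: AL = sigmoid outputs of the last layer, Y = true labels
AL = np.array([[0.8, 0.9, 0.4]])
Y = np.array([[1, 1, 0]])

# dL/dAL for the cross-entropy cost, written with np.divide
dAL = - (np.divide(Y, AL) - np.divide(1 - Y, 1 - AL))
print(dAL)   # same shape as AL, here (1, 3)
```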
Step 2: We calculate the gradients for the Lth layer. Since the last activation function is sigmoid, we compute them using linear_activation_backward with the Lth layer's cache (see the sketch after the bullets below).

  • In this step we calculate dA[L-1], and dW and db for the Lth layer.
  • The caches are basically the values (A[l-1], W[l], b[l]) and Z[l] for each layer that we stored while computing forward propagation.
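
In code, my understanding of this step looks roughly like this (a sketch only; it assumes dAL, caches, and the notebook's linear_activation_backward are already in scope, and uses the notebook's key names for the grads dictionary):

```python
# Sketch of step 2 (assumes dAL, caches and linear_activation_backward exist)
grads = {}
L = len(caches)                  # number of layers
current_cache = caches[L - 1]    # cache stored for the last (Lth) layer

# sigmoid backward for the output layer gives dA[L-1], dW[L], db[L]
grads["dA" + str(L - 1)], grads["dW" + str(L)], grads["db" + str(L)] = \
    linear_activation_backward(dAL, current_cache, activation="sigmoid")
```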

Step 3: We loop from layer L-1 down to 1, traversing the hidden layers in reverse.

  • For each l (lowercase L) we calculate dA of the previous layer, plus dW and db for layer l, from the previously calculated dA[l] and the cache stored for that particular layer, using activation function “relu” (see the sketch after this list).
  • We keep storing all the calculated gradients: dA[1 … L-1] and dAL
  • dW [1…L]
  • db [1 …L]
  • Then we return them, so that we can use these gradients to update the parameters in each iteration later.
  • This function performs one full backward pass from layer L down to layer 1.
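
And the loop itself, roughly (again just a sketch that assumes grads, caches, L and linear_activation_backward from the previous step are in scope):

```python
# Sketch of step 3: loop over the hidden layers in reverse
for l in reversed(range(L - 1)):          # l = L-2, ..., 0  -> layers L-1, ..., 1
    current_cache = caches[l]             # cache stored for layer l+1
    dA_prev_temp, dW_temp, db_temp = linear_activation_backward(
        grads["dA" + str(l + 1)], current_cache, activation="relu")
    grads["dA" + str(l)] = dA_prev_temp
    grads["dW" + str(l + 1)] = dW_temp
    grads["db" + str(l + 1)] = db_temp

# grads now holds dA0 ... dA(L-1), dW1 ... dWL, db1 ... dbL and gets returned
```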

Yes, your description of what should be happening sounds right to me. Does your code pass the tests in the notebook? Note that the test case is a bit underwhelming, in that it uses only a 2 layer network, so the loop over the hidden layers executes only once.

Of course there are always some details to deal with when you go from the verbal description of an algorithm to the actual code. E.g. notice that the caches array has L elements in it, but indexing in Python is 0-based. So that means the caches entry for layer 1 (the first layer) is caches[0], right?
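
Here is a tiny standalone illustration of that off-by-one mapping (the layer count and cache contents are just made-up placeholders):

```python
L = 4   # hypothetical 4-layer network
caches = ["cache_layer1", "cache_layer2", "cache_layer3", "cache_layer4"]

# Layer numbers are 1-based, Python list indices are 0-based:
for layer in range(1, L + 1):
    print(f"layer {layer} -> caches[{layer - 1}] = {caches[layer - 1]}")
```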