Hi,
I’m stuck on exercise 9 at the instruction: dA_prev_temp, dW_temp, db_temp = ?
This statement appears twice: before the for loop and inside the for loop.
I think I understood the general principle of the algorithm well but I have trouble with certain technical details.
Hi, @Pierre_BEJIAN. Before we go ahead with the algorithmic intuition, have you properly implemented dA_prev_temp, dW_temp, db_temp = ...? In other words, has your L_model_backward(...) passed its tests?
You are returning a dictionary containing the gradient values, i.e. the evaluations of the gradient based on the outputs from forward propagation. Be mindful of the function’s “signature”: L_model_backward(AL, Y, caches), and carefully read the “docstring” embedded in triple quotes following the signature.
If the function has passed its tests (and all previous tests have been passed), then you have implemented the proper “helper function” to fill out dA_prev_temp, dW_temp, db_temp = ..., which are returned in the grads dictionary. Why are these useful? Take a look at the ensuing function. Same deal: carefully study that function’s signature and read its docstring.
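For orientation only, here is a rough sketch of how those three temporaries end up stored in the grads dictionary. The key names and the exact indexing convention come from the notebook’s docstring, so verify them there; the values below are mere placeholders, not real gradients.

```python
# Rough sketch only: the shape of the dictionary the function is expected to return.
# Double-check the exact key/index convention in the docstring of your notebook.
grads = {}
l = 2                                                  # pretend we are processing layer 2
dA_prev_temp, dW_temp, db_temp = "dA1", "dW2", "db2"   # stand-ins for the helper's outputs
grads["dA" + str(l - 1)] = dA_prev_temp   # gradient w.r.t. the *previous* layer's activations
grads["dW" + str(l)] = dW_temp            # gradient w.r.t. this layer's weights
grads["db" + str(l)] = db_temp            # gradient w.r.t. this layer's biases
print(grads)                              # {'dA1': 'dA1', 'dW2': 'dW2', 'db2': 'db2'}
```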
Doing the same for the previous functions will help you formulate more precise questions regarding “certain technical details.”
Hi,
To be precise, it is the L_model_backward(…) function that is giving me trouble and that did not pass the test. I think I need to use the cache to complete the lines dA_prev_temp, dW_temp, db_temp = ... (before and inside the loop), but I’m a bit lost, even after reading the docstrings.
That’s OK. I encourage you to take the Gestalt perspective when it comes to the notebooks. Sorry, you’re French; a Merleau-Ponty perspective? I digress. Put differently, the notebooks are structured with a natural flow that helps one see the forest for the trees.
Question: which of the “helper” functions that you have already completed will “help” you to produce values for dA_prev, dW, and db? I can’t answer that for you because it would spoil everyone else’s fun! Here’s a hint: Which of your helper functions returns those values? That information is always in the docstring (function documentation) under “Returns: …”. Optionally, you can jump right to the return statement at the end of the function.
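If it helps, here is a throwaway example (toy_helper is made up, not one of the notebook’s functions) showing where the “Returns:” section lives in a docstring and how to print it from a notebook cell:

```python
def toy_helper(x):
    """A made-up function, only to show where the "Returns:" section lives.

    Arguments:
    x -- any number

    Returns:
    y -- twice the input
    """
    return 2 * x

help(toy_helper)            # prints the docstring above, "Returns:" section included
print(toy_helper.__doc__)   # same text, without help()'s extra formatting
```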
You also noted that this function is required in two distinct parts of the code. The comments within the body of the function are useful in this regard (comments in Python are preceded by a #). The first application of the yet-to-be-identified helper function is for the last, or L-th, layer. To get backward propagation rolling, you must start there. It stands on its own because it applies a different activation function from the other layers, and that activation is passed as a parameter to the helper function.
The second application is in the for loop, which rolls you back from layer L-1 all the way to the first layer. At the end of it all, the function will have evaluated the gradients for the values produced by a single pass through forward propagation.
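To make that two-application pattern concrete without giving away the notebook’s code, here is a generic, from-scratch sketch. Every name in it (one_layer_backward, full_backward, the cache layout) is my own invention, and the derivative formulas assume a ReLU/sigmoid/cross-entropy setup; the notebook’s helpers differ in their details, so use this only to see the shape of the flow: the output layer is handled once, then a loop walks backward through the remaining layers. The smoke test at the bottom just confirms the gradient arrays come out with the same shapes as the corresponding weights and biases.

```python
import numpy as np

def one_layer_backward(dZ, A_prev, W):
    """Gradients for one dense layer, given dZ = dJ/dZ of that layer (illustrative only)."""
    m = A_prev.shape[1]                                # number of examples
    dW = (dZ @ A_prev.T) / m                           # gradient w.r.t. this layer's weights
    db = np.sum(dZ, axis=1, keepdims=True) / m         # gradient w.r.t. this layer's biases
    dA_prev = W.T @ dZ                                 # handed to the next-earlier layer
    return dA_prev, dW, db

def full_backward(AL, Y, caches):
    """caches[l] = (A_prev, Z, W) saved for layer l+1 during the forward pass (my own convention)."""
    grads = {}
    L = len(caches)

    # Output layer first: sigmoid + cross-entropy gives dZ = AL - Y directly.
    A_prev, Z, W = caches[L - 1]
    dZ = AL - Y
    dA_prev, dW, db = one_layer_backward(dZ, A_prev, W)
    grads[f"dW{L}"], grads[f"db{L}"] = dW, db

    # Remaining layers, walking backward: ReLU here, so dZ = dA * 1[Z > 0].
    for l in reversed(range(L - 1)):
        A_prev, Z, W = caches[l]
        dZ = dA_prev * (Z > 0)
        dA_prev, dW, db = one_layer_backward(dZ, A_prev, W)
        grads[f"dW{l + 1}"], grads[f"db{l + 1}"] = dW, db

    return grads

# Tiny smoke test with random data (3 inputs, 4 hidden units, 1 output, 5 examples).
rng = np.random.default_rng(0)
X = rng.standard_normal((3, 5))
Y = rng.integers(0, 2, size=(1, 5))
W1, b1 = rng.standard_normal((4, 3)) * 0.1, np.zeros((4, 1))
W2, b2 = rng.standard_normal((1, 4)) * 0.1, np.zeros((1, 1))

Z1 = W1 @ X + b1
A1 = np.maximum(0, Z1)
Z2 = W2 @ A1 + b2
AL = 1 / (1 + np.exp(-Z2))

caches = [(X, Z1, W1), (A1, Z2, W2)]
grads = full_backward(AL, Y, caches)
print({k: v.shape for k, v in grads.items()})   # shapes match W1, b1, W2, b2
```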
This notebook is well worth extra time and effort. It is easily the most challenging one in the first course, and for good reason: forward and backward propagation is the heart and soul of deep learning. It is the prime algorithm. Without it, no deep learning. And without deep learning, the world looks a lot different, for good and for bad. We hope to keep you away from the dark side! Inspiring, no?
So, go with the flow, and think hard along the way. Slow down. Fast is slow and slow is fast.
Thank you for your encouraging answer, which gives lots of valuable advice (careful reading of the docstrings, etc.). I finally managed to complete exercise 9.
And exercise 10 seems easier to me.