I am having a problem defining current_cache for the Lth layer (SIGMOID → LINEAR) gradients.
Is it possible for a mentor to look into my assignment? Explaining the problem may be complicated.
@paulinpaloalto Need help.
Please realize that no one else can see your assignments. Note that the handling of the caches is the same in every layer. There is nothing different in that respect for the output layer versus the hidden layers.
Oh, actually, in a Python course on Coursera, mentors could look into the assignments. Sorry. Can I send you the code some other way?
My lab ID is jnzchluk
As I mentioned earlier, I cannot look at your assignments, so giving me the lab id serves no purpose.
Why don’t you start by showing us the error message you are getting. If we can’t figure it out from that, then you can DM me the code. But it’s not supposed to be my job to do your work for you, so why don’t we start by seeing if we can figure out how to debug this by working from the symptoms.
Sir, I didn’t mean that you have to do my work by sharing my Lab ID. Actually, I read in the help section that instructors can see your lab work if you share your Lab ID. What I intended was that maybe it would help you understand the problem better.
Anyway, here is the error that is occurring.
The problem is not with the cache value. I think you are handling that correctly. The problem is with the dA value that is being passed down to relu_backward. You are always passing dAL when you call linear_activation_backward, instead of the dA from the following layer. So the dimension of dA does not match the dimension of Z, which is what the error message is telling you. You can click “File → Open” and then open dnn_utils.py to read the code in relu_backward to understand what it is doing, which will help you interpret the error message as I did above.
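For reference, relu_backward in dnn_utils.py looks roughly like this (I am paraphrasing from memory, so open the file yourself for the exact code):

```python
import numpy as np

def relu_backward(dA, cache):
    """Backward pass for a ReLU unit: zero the gradient where Z <= 0."""
    Z = cache
    dZ = np.array(dA, copy=True)   # dA must have the same shape as Z
    dZ[Z <= 0] = 0                 # this boolean indexing blows up if the shapes differ
    return dZ
```

That indexing line is where a shape mismatch between the dA you passed in and the Z stored in the cache shows up as an error.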
BTW it may well be that the course staff from deeplearning.ai can use your lab id, but the mentors are just fellow students. We do not get paid to do this and we do not have any magic powers to view other people’s assignments.
I tried replacing dAL with dA, but now the error says that dA is undefined.
So where is dA defined? How do you find the dA^{[l]} value for layer l in that computation?
The point being that you need to think a bit harder about what is actually happening in that for loop. It takes as input the dA from the following layer and produces the dA for the current layer (among other things). Rinse and repeat …
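For reference, here is the recurrence from the instructions, written in the same notation (this is just standard backprop, so I am quoting it from memory; * is an element-wise product):

dZ^{[l]} = dA^{[l]} * g'(Z^{[l]})
dW^{[l]} = (1/m) dZ^{[l]} A^{[l-1]T}
db^{[l]} = (1/m) sum_{i=1}^{m} dZ^{[l](i)}
dA^{[l-1]} = W^{[l]T} dZ^{[l]}

Notice that the last line produces exactly the dA^{[l-1]} that the next iteration of the loop needs as its input.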
As far as I understand, the for loop iterates over the layers in reverse, excluding the last one. linear_activation_backward computes dA and the other values for the previous layer, and the cycle repeats. But first I need to define the dA to feed into linear_activation_backward for the second-to-last layer.
Uhhh, I think I got it. I fed in the dA_prev_temp from the linear_activation_backward of the last layer written above, and it worked. Was that what I was missing?
Right. The point is that it is dAL for the second-to-last layer. But what is it for the third-to-last layer or the fourth-to-last layer? You just computed it on the previous iteration of the loop, right? Did you actually study the formulas shown in the instructions?
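Here is a schematic sketch of the loop structure we are discussing, using the variable names from this thread (dAL, caches, dA_prev_temp, linear_activation_backward); the exact indexing and dictionary keys in the real notebook may differ, so treat it as an outline rather than the assignment code:

```python
def l_model_backward_sketch(dAL, caches, linear_activation_backward):
    """Outline of the backward pass discussed above (hypothetical names)."""
    grads = {}
    L = len(caches)  # number of layers

    # Output layer (SIGMOID -> LINEAR): the recursion is seeded with dAL.
    current_cache = caches[L - 1]
    dA_prev_temp, dW_temp, db_temp = linear_activation_backward(dAL, current_cache, "sigmoid")
    grads["dA" + str(L - 1)] = dA_prev_temp
    grads["dW" + str(L)] = dW_temp
    grads["db" + str(L)] = db_temp

    # Hidden layers (RELU -> LINEAR): each iteration consumes the dA produced by
    # the previous iteration, never dAL again.
    for l in reversed(range(L - 1)):
        current_cache = caches[l]
        dA_prev_temp, dW_temp, db_temp = linear_activation_backward(
            grads["dA" + str(l + 1)], current_cache, "relu")
        grads["dA" + str(l)] = dA_prev_temp
        grads["dW" + str(l + 1)] = dW_temp
        grads["db" + str(l + 1)] = db_temp

    return grads
```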
Yes, at least that’s my guess from the error that you showed. See if things work now and the tests pass.
Sorry, maybe my mind just skipped that while working on the exercise. I corrected it as I realised it.
Great! That’s how it’s supposed to work, right? Nothing works the first time, but you learn something as you go through the process of figuring out what went wrong. Onward!
So true! Thank you for the guidance and that insight.