Hello, I'm having trouble understanding where I am going wrong here. I'm finding both the debugging and the math behind backpropagation very difficult to understand.
Any help is much appreciated.
{moderator edit - solution code removed}
The problem is that you are passing down the entire caches array at every layer. You are supposed to be selecting the appropriate entry in that array to pass down.
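To make that concrete, here is a minimal sketch with toy data, not the assignment code. It assumes caches is a list with one (linear_cache, activation_cache) pair per layer and a two-layer network; make_cache and unpack are hypothetical helpers invented for the illustration. With exactly two layers, unpacking the whole list succeeds silently, and the "activation cache" ends up being the second layer's entire tuple:

```python
import numpy as np

rng = np.random.default_rng(0)

def make_cache(n_prev, n_curr, m=4):
    """Hypothetical helper: build one layer's (linear_cache, activation_cache) pair."""
    A_prev = rng.standard_normal((n_prev, m))
    W = rng.standard_normal((n_curr, n_prev))
    b = np.zeros((n_curr, 1))
    Z = W @ A_prev + b
    return ((A_prev, W, b), Z)

# Toy two-layer network: one cache entry per layer (an assumption for this sketch).
caches = [make_cache(3, 2), make_cache(2, 1)]

def unpack(cache):
    """Split a single layer's cache into its two parts."""
    linear_cache, activation_cache = cache
    return linear_cache, activation_cache

# Buggy call: the whole list is passed, so the two "parts" are really
# layer 1's cache and layer 2's cache.
_, bad_activation_cache = unpack(caches)

# Intended call: select the entry for the layer being processed (here the last one).
L = len(caches)
_, good_activation_cache = unpack(caches[L - 1])

print(type(bad_activation_cache).__name__)   # tuple
print(type(good_activation_cache).__name__)  # ndarray
```

With more than two layers the same mistake would fail earlier, at the unpacking step, with a "too many values to unpack" error instead.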
Actually maybe what I just did there is counterproductive: I’ve basically given you the answer without teaching you how you could have figured that out for yourself. It is a general debugging principle that just because the error is thrown down in some subroutine, that doesn’t mean that is where the bug is. But first you analyze what happened at the low level at the point where the error is actually thrown:
You can see from the error line that the Z value there is a “tuple” instead of being a numpy array. That’s what that error message is telling you. (Actually that’s probably the First Law of Debugging: believe the error message. If you don’t understand what it is saying, then the first step is you need to work harder at understanding that.) So in this case what that means is that the activation_cache argument that you passed down is incorrect. So you pop the call stack back to the point at which you are calling sigmoid_backward. The contents of the activation cache should just be the value Z, but somehow it is a tuple. Print the type of that value. What do you see? Where did it come from? They gave you the template code to extract the linear cache and the activation cache from the “cache” argument that you passed to linear_activation_backward. So somehow that must be what is wrong, which then leads you back to the top level of the call stack, which is where I spotted the bug in the code that you showed.
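Here is a rough sketch of that "print the type" step. The sigmoid_backward_sketch function below is only a stand-in I wrote for illustration (the course's own helper may differ in detail), and describe is a hypothetical convenience function. The point is just that handing a tuple to code that expects an array fails at the arithmetic, and inspecting the value tells you what you really passed:

```python
import numpy as np

def sigmoid_backward_sketch(dA, activation_cache):
    """Stand-in for a sigmoid backward step; expects activation_cache to be the array Z."""
    Z = activation_cache
    s = 1 / (1 + np.exp(-Z))   # fails here if Z is a tuple rather than an ndarray
    return dA * s * (1 - s)

def describe(value):
    """Hypothetical helper: summarize what kind of object a value is."""
    if isinstance(value, np.ndarray):
        return f"ndarray with shape {value.shape}"
    if isinstance(value, (tuple, list)):
        return f"{type(value).__name__} of length {len(value)}"
    return type(value).__name__

dA = np.ones((1, 3))
Z = np.zeros((1, 3))

# What the low-level routine receives when the wrong thing is passed down:
# a whole (linear_cache, activation_cache) pair instead of just Z.
wrong_cache = ((np.ones((2, 3)), np.ones((1, 2)), np.zeros((1, 1))), Z)

try:
    sigmoid_backward_sketch(dA, wrong_cache)
    raised = False
except TypeError:
    raised = True
    print("activation_cache is a", describe(wrong_cache))

# With the array itself, the same call works.
dZ = sigmoid_backward_sketch(dA, Z)
print(describe(dZ))
```

Once the printout shows a tuple where an array was expected, the next question is who built that tuple, which is what sends you back up the call stack.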
@paulinpaloalto thank you so much for all the info. I’m clearly a noob (lol) and trying to learn all I can about every line of code on here. I really appreciate you taking the time to explain this. I was trying to visualize this network and everything going on, and this helped me connect the dots. Thanks again.