I do not understand how this works, so I have two questions. First, in this exercise (pool_backward), I think I have everything right except for the last part, where we compute dA_prev, at least for the max mode, which is where the autograder says my result is wrong.
The instructions say to “Set dA_prev to be dA_prev + (the mask multiplied by the correct entry of dA)”, but I do not know which entry of dA is the correct one. I used
{moderator edit - solution code removed}
but it does not work. Is there any hint or clue you can give me?
My other question, which might help me, is: how does backprop in a CNN work? I did not understand it. I know the course says this is hard to understand, but if you know where I can learn more I would appreciate it. At this point I am just trying things that make no sense to me in this exercise.
The indices into dA are the training example, the vertical position, the horizontal position, and the channel currently under consideration!
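As a minimal sketch of that indexing (the array shape and index values below are made up for illustration, not taken from the assignment): using all four indices on dA gives a single number, not a slice.

```python
import numpy as np

np.random.seed(0)

# Hypothetical gradient array laid out as (m, n_H, n_W, n_C):
# 3 examples, 4 rows, 2 columns, 2 channels.
dA = np.random.randn(3, 4, 2, 2)

i, h, w, c = 0, 1, 0, 1      # example, vertical, horizontal, channel
point = dA[i, h, w, c]       # one scalar entry of dA

print(np.ndim(point))        # 0 dimensions: a single value
print(dA[i].shape)           # indexing only the example still leaves an array
```

Printing the number of dimensions like this is a quick way to check whether an expression is a single entry or still a range.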
Just to restate Gent’s point in different words: the dA value is just one “point” in the output, not a range of pixels. Backward propagation is the mirror image of forward prop: it maps backward to the range of pixels that generated that output “point” during forward prop. And note that conv_backward worked the same way, right? You’re mapping backward from one point to a range (the filter size) of the input.
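To make the "one point maps back to a range" idea concrete, here is a rough sketch. The helper name window_bounds is hypothetical, and it assumes the usual convention that output position (h, w) was computed from the input window starting at row h * stride and column w * stride, with f rows and f columns.

```python
def window_bounds(h, w, f, stride):
    """Return (vert_start, vert_end, horiz_start, horiz_end) of the input
    window that produced output position (h, w), assuming no padding."""
    vert_start = h * stride
    horiz_start = w * stride
    return vert_start, vert_start + f, horiz_start, horiz_start + f


# Output point (1, 2) with a 2x2 window and stride 1
# maps back to input rows 1..2 and columns 2..3 (end indices exclusive).
print(window_bounds(1, 2, f=2, stride=1))   # (1, 3, 2, 4)
```

The single gradient value at output position (h, w) is what gets sent back onto that whole window of the input.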
Well, I knew that, since that is what the pooling layer does, but it does not help. I still do not know how to get the correct value, or else something else is wrong. Here is the code for the conditional:
{moderator edit - solution code removed}
The output I get is:
mode = max
mean of dA = 0.14571390272918056
dA_prev1[1,1] =
[[0. 0. ]
[6.63920871 1.48408832]
[0. 0. ]]
mode = average
mean of dA = 0.14571390272918056
dA_prev2[1,1] =
[[ 0.51968619 -0.05567156]
[ 1.59203722 -0.39078885]
[ 1.07235103 -0.33511729]]
And should be:
mode = max:
mean of dA = 0.145713902729
dA_prev[1,1] =
[[ 0. 0. ]
[ 5.05844394 -1.68282702]
[ 0. 0. ]]
mode = average
mean of dA = 0.145713902729
dA_prev[1,1] =
[[ 0.08485462 0.2787552 ]
[ 1.26461098 -0.25749373]
[ 1.17975636 -0.53624893]]
The problem is what is on the RHS of both assignment statements when you are projecting backward. In the max case, it’s not the max of a_prev_slice. It’s the single point value of dA that is getting “projected” backwards onto the position of the max of a_prev_slice, right?
In the average case, the RHS should also use a single point value of dA, but yours does not.
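Here is a toy illustration of both cases on a single 2D window. It is not the assignment code: the function names max_window_grad and avg_window_grad and the numbers are made up, and it only shows what happens to one scalar gradient value for one window.

```python
import numpy as np


def max_window_grad(window, d_out):
    """Send the scalar d_out to the position(s) of the maximum of window."""
    mask = (window == window.max())   # True where the max was, False elsewhere
    return mask * d_out               # zeros everywhere except at the max


def avg_window_grad(shape, d_out):
    """Spread the scalar d_out evenly over a window of the given shape."""
    n_h, n_w = shape
    return np.full(shape, d_out / (n_h * n_w))


window = np.array([[1.0, 3.0],
                   [2.0, 0.5]])
d_out = 4.0   # the single upstream gradient value for this window

print(max_window_grad(window, d_out))        # 4.0 at the max position, 0 elsewhere
print(avg_window_grad(window.shape, d_out))  # 1.0 in every position
```

In both functions the thing on the right-hand side is the scalar d_out, never a value taken from the window itself. The window is only used to decide where the gradient goes.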
Thanks, I got it.
A piece of advice for those with problems here: write dA out on paper, print the variables you have, and try to find a way to pick out a single entry in each iteration using those variables. It ends up making sense even if it had not before.