I get the following error. I’ve read almost all the topics similar to this one, and users usually said their problem was solved once they initialized dA_prev with the shape of A_prev. I’ve already done that, but I still get the same error.
Thanks so much for your help and Happy Holidays!
(5, 4, 2, 2)
(5, 5, 3, 2)
ValueError Traceback (most recent call last)
in
7 dA = np.random.randn(5, 4, 2, 2)
8
----> 9 dA_prev1 = pool_backward(dA, cache, mode = "max")
10 print("mode = max")
11 print('mean of dA = ', np.mean(dA))
in pool_backward(dA, cache, mode)
50
51 # Set dA_prev to be dA_prev + (the mask multiplied by the correct entry of dA) (≈1 line)
---> 52 dA_prev[i, vert_start: vert_end, horiz_start: horiz_end, c] += mask * dA[i, h, w, c]
53 #print(dA_prev.shape)
54 elif mode == "average":
ValueError: non-broadcastable output operand with shape (2,1) doesn't match the broadcast shape (2,2)
The line of code that “throws” looks correct to me, so the question is which of the inputs is the wrong shape. Maybe your “create mask” routine is incorrect or you invoked it incorrectly. Print the shapes of everything before the line that “throws” and that should give you some direction about which one is wrong.
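One way to follow the "print the shapes" advice is a small check that compares the target slice with the mask before the in-place add. This is only a sketch: `check_window` is a hypothetical helper, not part of the assignment, and the arrays are stand-ins that reuse the shapes printed in the question.

```python
import numpy as np

# Stand-in arrays with the shapes from the question (not the real assignment data).
dA_prev = np.zeros((5, 5, 3, 2))     # same shape as A_prev
mask = np.ones((2, 2), dtype=bool)   # placeholder for a 2x2 pooling-window mask


def check_window(dA_prev, mask, i, vert_start, vert_end, horiz_start, horiz_end, c):
    """Hypothetical helper: return the slice shape, failing early if it differs from the mask."""
    target = dA_prev[i, vert_start:vert_end, horiz_start:horiz_end, c]
    if target.shape != mask.shape:
        raise ValueError(
            f"slice shape {target.shape} != mask shape {mask.shape} "
            f"(rows {vert_start}:{vert_end}, cols {horiz_start}:{horiz_end})"
        )
    return target.shape


# A window that fits inside the 5x3 plane passes the check.
print(check_window(dA_prev, mask, 0, 0, 2, 0, 2, 0))
```

Calling it just before the line that throws reports which window indices produce the mismatched slice.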
Also note that it’s worth stepping back and making sure you understand the point of that code. This is for the “max pooling” case, so what is happening is that the function of the mask is to apply the gradients just to the elements of dA_prev that were the maximum values, right? So on the RHS of that assignment, dA[i, h, w, c] is a scalar value and mask should be the same shape as the indexed subset of dA_prev on the LHS.
So why did that not work out in your case? In addition to the “create mask” routine being wrong, it could also be some problem with the stride or the dimensions not being handled correctly. Just as with forward propagation, the loops are over the output space but in this case we are applying the gradients in the input space, so we need to calculate where that is in each iteration. The loop logic and the way we manage the “prev” index values is the same as in forward propagation.
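The description above can be sketched as follows. This is a minimal illustration under stated assumptions, not the notebook's reference solution: the function name `max_pool_backward` and its signature (taking `A_prev`, `f`, and `stride` directly rather than a cache) are made up for the example, and `f = 2`, `stride = 1` are inferred from the shapes in the question.

```python
import numpy as np


def create_mask(x):
    """True where x equals its maximum (ties would mark every maximal entry)."""
    return x == np.max(x)


def max_pool_backward(dA, A_prev, f, stride):
    """Route each output gradient back to the max element of its input window."""
    m, n_H, n_W, n_C = dA.shape          # loops run over the OUTPUT space
    dA_prev = np.zeros_like(A_prev)      # gradients land in the INPUT space
    for i in range(m):
        for h in range(n_H):             # h = height index, drives the vertical slice
            vert_start = h * stride
            vert_end = vert_start + f
            for w in range(n_W):         # w = width index, drives the horizontal slice
                horiz_start = w * stride
                horiz_end = horiz_start + f
                for c in range(n_C):
                    window = A_prev[i, vert_start:vert_end, horiz_start:horiz_end, c]
                    mask = create_mask(window)   # same shape as the window, (f, f)
                    # dA[i, h, w, c] is a scalar, so mask * scalar keeps shape (f, f)
                    dA_prev[i, vert_start:vert_end, horiz_start:horiz_end, c] += mask * dA[i, h, w, c]
    return dA_prev


rng = np.random.default_rng(0)
A_prev = rng.standard_normal((5, 5, 3, 2))
dA = rng.standard_normal((5, 4, 2, 2))
dA_prev = max_pool_backward(dA, A_prev, f=2, stride=1)
print(dA_prev.shape)  # (5, 5, 3, 2)
```

Because each output gradient is sent to exactly one input element when there are no ties, the sum of `dA_prev` should match the sum of `dA`, which makes a quick sanity check.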
Thanks so much for the response. After printing all the dimensions before that line, I realized the LHS was the problem: horiz and vert should be exchanged. It should be dA_prev[…, horiz_start:horiz_end, vert_start:vert_end, …] instead. I think I inherited this LHS from the notebook before completing the RHS, and the order of horizontal and vertical in the notebook recommendation was incorrect(?). Now I pass all the tests. Many thanks!
It’s great that you found the solution. Yes, it’s a bit confusing but we have to remember that h stands for height (not horizontal) and w for width …