Wrong result on exercise 3 conv_forward

Here is some output (sorry for the volume!) with the error message at the bottom. I am not including my code, as I know that is not allowed. Please let me know if I can send it privately.

n_C_prev =  4
n_H_prev =  5
n_W_prev =  7
n_C =  8
m =  2
f =  3
stride =  2
pad =  1
n_H =  3
n_W =  4
A_prev_pad dimensions:  (2, 7, 9, 4)
a_prev_pad dim =  (7, 9, 4)
a_slice_prev =  [[ 0.          0.          0.        ]
 [ 0.          1.62434536  0.86540763]
 [ 0.         -0.26788808 -0.6871727 ]]
weights =  [[-0.27909772 -2.03720123  0.58464661]
 [-1.30653407  0.84086156 -0.0693287 ]
 [ 0.52887975  0.63658341  0.85328219]]
Z =  -0.8498461589307214
a_slice_prev =  [[ 0.          0.          0.        ]
 [ 0.         -0.61175641 -2.3015387 ]
 [ 0.          0.53035547 -0.84520564]]
weights =  [[-0.43750898 -0.66134424  0.00854895]
 [-0.785534    1.10417433 -1.68405999]
 [ 1.86647138  0.1892932  -0.35340998]]
Z =  3.681308111668036
a_slice_prev =  [[ 0.          0.          0.        ]
 [ 0.         -0.52817175  1.74481176]
 [ 0.         -0.69166075 -0.67124613]]
weights =  [[-0.18657899  0.28267571 -0.70134443]
 [-0.11598519  0.31027229 -0.53223402]
 [-0.76730983  0.69336623 -0.27584606]]
Z =  -1.846881672398074
a_slice_prev =  [[ 0.          0.          0.        ]
 [ 0.         -1.07296862 -0.7612069 ]
 [ 0.         -0.39675353 -0.0126646 ]]
weights =  [[ 0.58591043  1.12232832  0.35249436]
 [ 0.79452824  0.35016716 -0.59384307]
 [ 0.63019567  1.76041518  1.03298378]]
Z =  0.009139474876398657
a_slice_prev =  [[ 0.          0.          0.        ]
 [ 0.86540763  0.3190391  -0.3224172 ]
 [-0.6871727  -1.11731035 -0.19183555]]
weights =  [[-0.27909772 -2.03720123  0.58464661]
 [-1.30653407  0.84086156 -0.0693287 ]
 [ 0.52887975  0.63658341  0.85328219]]
Z =  -3.477259710673636
a_slice_prev =  [[ 0.          0.          0.        ]
 [-2.3015387  -0.24937038 -0.38405435]
 [-0.84520564  0.2344157  -0.88762896]]
weights =  [[-0.43750898 -0.66134424  0.00854895]
 [-0.785534    1.10417433 -1.68405999]
 [ 1.86647138  0.1892932  -0.35340998]]
Z =  1.0416450131981339
a_slice_prev =  [[ 0.          0.          0.        ]
 [ 1.74481176  1.46210794  1.13376944]
 [-0.67124613  1.65980218 -0.74715829]]
weights =  [[-0.18657899  0.28267571 -0.70134443]
 [-0.11598519  0.31027229 -0.53223402]
 [-0.76730983  0.69336623 -0.27584606]]
Z =  1.0599109550251988
a_slice_prev =  [[ 0.          0.          0.        ]
 [-0.7612069  -2.06014071 -1.09989127]
 [-0.0126646   0.74204416  1.6924546 ]]
weights =  [[ 0.58591043  1.12232832  0.35249436]
 [ 0.79452824  0.35016716 -0.59384307]
 [ 0.63019567  1.76041518  1.03298378]]
Z =  3.017925253111515
a_prev_pad dim =  (7, 9, 4)
a_slice_prev =  [[ 0.          0.          0.        ]
 [ 0.         -0.18656977  0.52946532]
 [ 0.          2.19069973  2.52832571]]
weights =  [[-0.27909772 -2.03720123  0.58464661]
 [-1.30653407  0.84086156 -0.0693287 ]
 [ 0.52887975  0.63658341  0.85328219]]
Z =  1.9595390793946623
a_slice_prev =  [[ 0.          0.          0.        ]
 [ 0.         -0.10174587  0.13770121]
 [ 0.         -1.89636092 -0.24863478]]
weights =  [[-0.43750898 -0.66134424  0.00854895]
 [-0.785534    1.10417433 -1.68405999]
 [ 1.86647138  0.1892932  -0.35340998]]
Z =  -0.5335726694973995
a_slice_prev =  [[ 0.          0.          0.        ]
 [ 0.          0.86888616  0.07782113]
 [ 0.         -0.64691669  0.04366899]]
weights =  [[-0.18657899  0.28267571 -0.70134443]
 [-0.11598519  0.31027229 -0.53223402]
 [-0.76730983  0.69336623 -0.27584606]]
Z =  -0.692366689178931
a_slice_prev =  [[ 0.          0.          0.        ]
 [ 0.          0.75041164  0.61838026]
 [ 0.          0.90148689 -0.22631424]]
weights =  [[ 0.58591043  1.12232832  0.35249436]
 [ 0.79452824  0.35016716 -0.59384307]
 [ 0.63019567  1.76041518  1.03298378]]
Z =  1.8931146167893287
a_slice_prev =  [[ 0.          0.          0.        ]
 [ 0.52946532  0.23249456  1.0388246 ]
 [ 2.52832571  1.33145711 -1.27255876]]
weights =  [[-0.27909772 -2.03720123  0.58464661]
 [-1.30653407  0.84086156 -0.0693287 ]
 [ 0.52887975  0.63658341  0.85328219]]
Z =  -0.8681898752421002
a_slice_prev =  [[ 0.          0.          0.        ]
 [ 0.13770121  0.68255141  2.18697965]
 [-0.24863478 -0.28730786  0.31354772]]
weights =  [[-0.43750898 -0.66134424  0.00854895]
 [-0.785534    1.10417433 -1.68405999]
 [ 1.86647138  0.1892932  -0.35340998]]
Z =  -3.585016350751383
a_slice_prev =  [[ 0.          0.          0.        ]
 [ 0.07782113 -0.31011677  0.44136444]
 [ 0.04366899  0.68006984  0.50318481]]
weights =  [[-0.18657899  0.28267571 -0.70134443]
 [-0.11598519  0.31027229 -0.53223402]
 [-0.76730983  0.69336623 -0.27584606]]
Z =  -0.5008704811136218
a_slice_prev =  [[ 0.          0.          0.        ]
 [ 0.61838026 -2.43483776 -0.10015523]
 [-0.22631424 -0.3198016   1.29322588]]
weights =  [[ 0.58591043  1.12232832  0.35249436]
 [ 0.79452824  0.35016716 -0.59384307]
 [ 0.63019567  1.76041518  1.03298378]]
Z =  0.9728260287640423
Z's mean =
 0.00667398398459088
Z[0,2,1] =
 [0. 0. 0. 0. 0. 0. 0. 0.]
cache_conv[0][1][2][3] =
 [-1.1191154   1.9560789  -0.3264995  -1.34267579]
First Test: Z's mean is incorrect. Expected: 0.5511276474566768 
Your output: 0.00667398398459088 . Make sure you include stride in your calculation

First Test: Z[0,2,1] is incorrect. Expected: [-2.17796037, 8.07171329, -0.5772704, 3.36286738, 4.48113645, -2.89198428, 10.99288867, 3.03171932] 
Your output: [0. 0. 0. 0. 0. 0. 0. 0.] Make sure you include stride in your calculation

The strategy of using print statements for debugging is great. Here’s my version of the debugging output from that function:

stride 2 pad 1
New dimensions = 3 by 4
Shape Z = (2, 3, 4, 8)
Shape A_prev = (2, 5, 7, 4)
Shape A_prev_pad = (2, 7, 9, 4)
Z[0,0,0,0] = -2.651123629553914
Z[1,2,3,7] = 0.4427056509973153
Z's mean =
 0.5511276474566768
Z[0,2,1] =
 [-2.17796037  8.07171329 -0.5772704   3.36286738  4.48113645 -2.89198428
 10.99288867  3.03171932]
cache_conv[0][1][2][3] =
 [-1.1191154   1.9560789  -0.3264995  -1.34267579]
First Test: All tests passed!
stride 1 pad 3
New dimensions = 9 by 11
Shape Z = (2, 9, 11, 8)
Shape A_prev = (2, 5, 7, 4)
Shape A_prev_pad = (2, 11, 13, 4)
Z[0,0,0,0] = 1.4306973717089302
Z[1,8,10,7] = -0.6695027738712113
stride 2 pad 0
New dimensions = 2 by 3
Shape Z = (2, 2, 3, 8)
Shape A_prev = (2, 5, 7, 4)
Shape A_prev_pad = (2, 5, 7, 4)
Z[0,0,0,0] = 8.430161780192094
Z[1,1,2,7] = -0.2674960203423288
stride 1 pad 6
New dimensions = 13 by 15
Shape Z = (2, 13, 15, 8)
Shape A_prev = (2, 5, 7, 4)
Shape A_prev_pad = (2, 17, 19, 4)
Z[0,0,0,0] = 0.5619706599772282
Z[1,12,14,7] = -1.622674822605305
Second Test: All tests passed!

So the shapes of your inputs and outputs are correct, but the values of Z are not. The fact that Z[0,2,1] ends up as all zeros is the big clue: that means you are skipping some elements of the output. The key thing to realize is that the striding happens in the input space (“prev”), right? We have to touch every point in the output space: it’s only in the input space that we skip things. Check the logic in your loops over h and w and see where you are including the stride value.
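
To make the "output space vs. input space" point concrete, here is a minimal sketch of the index arithmetic only, not the graded function. The helper names `conv_output_size` and `window_bounds` are made up for illustration; the numbers (5 by 7 input, f = 3, pad = 1, stride = 2) come from the debug output above. The output index always advances by 1, and the stride only affects where each window starts in the padded input.

```python
def conv_output_size(n_prev, f, pad, stride):
    """Output length along one axis: floor((n_prev + 2*pad - f) / stride) + 1."""
    return (n_prev + 2 * pad - f) // stride + 1


def window_bounds(out_index, f, stride):
    """(start, end) of the padded-input window for one output position.

    Hypothetical helper: the stride scales the output index into the
    input space; 'end' is exclusive, as in a Python slice.
    """
    start = out_index * stride
    return start, start + f


# Values from the first test case in this thread.
n_H = conv_output_size(5, f=3, pad=1, stride=2)  # 3
n_W = conv_output_size(7, f=3, pad=1, stride=2)  # 4

# Every output row and column is visited; only the input windows skip ahead.
rows = [window_bounds(h, f=3, stride=2) for h in range(n_H)]
cols = [window_bounds(w, f=3, stride=2) for w in range(n_W)]
print(rows)  # [(0, 3), (2, 5), (4, 7)]
print(cols)  # [(0, 3), (2, 5), (4, 7), (6, 9)]
```

Note that the last windows end at 7 and 9, which matches the padded shape (7, 9) shown for A_prev_pad.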

Hi Paul,

I really appreciate your help! I believe things are better and I have redone the stride calculations, but still something is wrong. Here is my code. I don’t want you to tell me exactly what is wrong, but if you can give me another hint, that would be great!

{moderator edit - solution code removed}

As I mentioned in my previous post, the loops here are over the output space, not the input space. That is not true for some of your loops.

Also think about how many dimensions W has.

The video before the exercise shows the iterations being done horizontally and vertically before moving to the next filter, but the loops in the suggested code seem to have the filter iterations done first. Am I interpreting that correctly? Also, I see that I had the wrong range (n_C_prev instead of n_C) in the c loop. I am still working on the weights question...

Yes, it was the loop over the channels that was incorrect. Glad to hear you found that issue.

For the weights, note that the dimensions of W are:

f x f x nC_{in} x nC_{out}, that is, (f, f, n_C_prev, n_C) in the notebook's variable names

So how does that map to how you are handling W in your code?

Oh, sorry, I forgot to respond to the first part of your question. Yes, you’re right that in the lectures, he seems to do the loop over the output channels on the “outside”, but the way they suggest doing it in the template code is to have the loop over the channels be the innermost loop. The order of the loops doesn’t really matter, as long as you cover everything.

Well, it’s really the loop over the samples that’s the outer loop in both cases, but the above description covers the inner three loops over height, width and channels.
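
A small sketch of why the loop order is not the issue but the loop range is. This only enumerates index tuples and does no convolution; the sizes are the ones from the first test case (n_H = 3, n_W = 4, n_C = 8, n_C_prev = 4), and the variable names are chosen for illustration.

```python
n_H, n_W, n_C, n_C_prev = 3, 4, 8, 4

# Channels as the innermost loop (the template's suggested order).
channels_innermost = [
    (h, w, c) for h in range(n_H) for w in range(n_W) for c in range(n_C)
]

# Channels as the outermost of the three loops (the order in the lecture).
channels_outermost = [
    (h, w, c) for c in range(n_C) for h in range(n_H) for w in range(n_W)
]

# Same set of output positions either way, just visited in a different order.
same_coverage = set(channels_innermost) == set(channels_outermost)

# Looping c over n_C_prev instead of n_C leaves half of the outputs untouched,
# so those entries of Z keep their initial zeros.
wrong_range = {
    (h, w, c) for h in range(n_H) for w in range(n_W) for c in range(n_C_prev)
}
missed = set(channels_innermost) - wrong_range

print(same_coverage)                         # True
print(len(channels_innermost), len(missed))  # 96 48
```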

I saw that and changed my weights assignment to W[:, :, n_c_prev, c] but I am still getting wrong results.

That just selects one previous channel from the filter, right? If you think about it, that’s not what you want. In fact, you would always be ignoring the first 3 channels and depending on “broadcasting” to do the computation.

Finally got it! I was also only slicing one channel from a_prev. Thanks so much for your help!
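
For anyone who finds this thread later, here is a hedged sketch of the shape issue, using random data rather than the notebook's values and leaving out the bias term. It assumes NumPy and the W layout (f, f, n_C_prev, n_C) mentioned above; the variable names are illustrative. The 3 by 3 `a_slice_prev` and `weights` arrays in the original debug output are the symptom: both should carry the previous-channel dimension as well.

```python
import numpy as np

f, n_C_prev, n_C = 3, 4, 8
rng = np.random.default_rng(0)

W = rng.standard_normal((f, f, n_C_prev, n_C))   # all filters
a_slice = rng.standard_normal((f, f, n_C_prev))  # one window, all input channels

one_channel = W[:, :, 0, 0]  # a single input channel of filter 0
full_filter = W[:, :, :, 0]  # every input channel of filter 0

print(one_channel.shape)  # (3, 3)
print(full_filter.shape)  # (3, 3, 4)

# One output value is the sum over the whole 3-D volume (bias omitted here).
z_full = np.sum(a_slice * full_filter)

# Equivalent: add up the 2-D contribution of each input channel.
z_by_channel = sum(
    np.sum(a_slice[:, :, k] * W[:, :, k, 0]) for k in range(n_C_prev)
)
```

The two results agree up to floating-point rounding, which shows that using a single 2-D channel drops the contributions of the other channels.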
