I’ve managed to solve the problem, but the solution does not make sense to me. For the A_prev slice, I used the same c from the loop, but I feel like this only works because n_C and n_C_prev happen to have the same value. Shouldn’t the A_prev and output channel counts (n_C_prev and n_C respectively) normally be different?
Right! The point is that pooling layers work differently from normal convolution layers. Pooling layers work “per channel”, so the number of output channels always equals the number of input channels. Of course the height and width dimensions are typically smaller, since that’s the whole point of pooling. But you can do “same” padding with pooling layers as well, so you just have to do the usual computation:
n_{out} = \left\lfloor \displaystyle \frac{n_{in} + 2p - f}{s} \right\rfloor + 1
The typical case with pooling layers is that p = 0 and s = f, but your code has to handle the “general” case in which that is not true.
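If it helps to see it concretely, here is a minimal NumPy sketch (the function name, defaults, and example are mine, not the assignment’s exact code) that computes the output size with that formula and then pools each channel on its own, using the same c to index both the input slice and the output:

```python
import numpy as np

def pool_forward_sketch(A_prev, f=2, stride=2, pad=0, mode="max"):
    """Minimal max/average pooling forward pass over a batch of images.

    A_prev has shape (m, n_H_prev, n_W_prev, n_C_prev); the output keeps
    the same number of channels, only the height and width shrink.
    """
    m, n_H_prev, n_W_prev, n_C_prev = A_prev.shape

    # Output spatial dims from n_out = floor((n_in + 2p - f)/s) + 1
    n_H = int((n_H_prev + 2 * pad - f) / stride) + 1
    n_W = int((n_W_prev + 2 * pad - f) / stride) + 1
    n_C = n_C_prev  # pooling never changes the channel count

    if pad > 0:
        A_prev = np.pad(A_prev, ((0, 0), (pad, pad), (pad, pad), (0, 0)))

    A = np.zeros((m, n_H, n_W, n_C))
    for i in range(m):
        for h in range(n_H):
            for w in range(n_W):
                for c in range(n_C):  # same c indexes input AND output channel
                    h0, w0 = h * stride, w * stride
                    window = A_prev[i, h0:h0 + f, w0:w0 + f, c]
                    A[i, h, w, c] = window.max() if mode == "max" else window.mean()
    return A

# Example: a 4x4 single-channel image max-pooled with f=2, s=2 -> 2x2 output
x = np.arange(16, dtype=float).reshape(1, 4, 4, 1)
print(pool_forward_sketch(x).squeeze())   # [[ 5.  7.] [13. 15.]]
```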
Thank you for the explanation; the channel part makes sense now, though the concept is still a little hard to grasp. Let me see if I have this right: a normal convolution layer takes everything in that slice (all channels included) and convolves it with the weights, while a pooling layer “pools” the slice but keeps each channel individually intact, which is why the channel counts before and after are always equal?
Yes, the channel counts before and after are equal. As you say, the “conv” layers have one f x f x nc_prev filter for each output channel, which is why the number of output channels is independent of the number of input channels in the case of a “conv” layer. But “pooling” layers operate on each channel separately, so the height and width change while the channels stay constant.
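To make the contrast concrete, here is a small NumPy sketch (the variable names are mine, just for illustration) comparing what one conv filter and one pooling step do to a single f x f slice:

```python
import numpy as np

f, n_C_prev = 3, 8
a_slice = np.random.randn(f, f, n_C_prev)      # one f x f patch across all input channels

# Conv: a single filter spans all n_C_prev channels and collapses the patch to one
# number; stacking n_C such filters gives n_C output channels, unrelated to n_C_prev.
W_one_filter = np.random.randn(f, f, n_C_prev)
conv_value = np.sum(a_slice * W_one_filter)    # scalar -> one output channel's value

# Pooling: each channel is reduced on its own, so the channel axis survives intact.
pool_values = a_slice.max(axis=(0, 1))         # shape (n_C_prev,) -> one value per channel
print(conv_value, pool_values.shape)
```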
That makes sense. Thank you for your help!