Hello, I want to make sure I understand this right:
unsymmetric padding leads to unsymmetric learning, is that right?
Unsymmetric padding happens when the filter size f is an even number,
so p = (f - 1)/2 would be a fraction.
Right! I’ll have to go back and find it in the lectures, but I do remember Prof Ng making the comment at some point in Week 1 that the common practice is to use odd filter sizes (e.g. f = 3, 5 or 7) because the math just works out more nicely. When the inputs are images, the height and width dimensions are typically an even number of pixels.
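To make the odd-vs-even point concrete, here is a small sketch that evaluates p = (f - 1)/2 for a few filter sizes. The function name `same_padding` is just something I made up for illustration, and it assumes stride 1 with the same padding on both sides.

```python
from fractions import Fraction


def same_padding(f):
    """Padding per side that keeps n_out == n_in at stride 1: p = (f - 1) / 2."""
    return Fraction(f - 1, 2)


for f in (2, 3, 4, 5, 7):
    p = same_padding(f)
    # A denominator of 1 means p is a whole number of pixels per side.
    kind = "whole number" if p.denominator == 1 else "fraction"
    print(f"f = {f}: p = {p} ({kind})")
```

For odd f the padding comes out as a whole number, while for even f it is a half-integer, so the padding cannot be split evenly between the two sides.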
Hi Paulin, got one question-
I have an input of size (64 x 64 x 3), and I am using zero padding of size (3, 3), filter size (7, 7), 64 filters, and stride (2, 2). What would the size of the output feature map be?
Prof Ng gives the formula in the lectures:
n_{out} = \displaystyle \lfloor \frac {n_{in} + 2p - f}{s} \rfloor + 1
Those strange-looking brackets are the notation for the mathematical “floor” function, which rounds down to the nearest integer. So let’s plug in the numbers from your example:
n_{out} = \displaystyle \lfloor \frac {64 + 2 * 3 - 7}{2} \rfloor + 1 = \displaystyle \lfloor \frac {63}{2} \rfloor + 1 = \lfloor 31.5 \rfloor + 1 = 32
So with 64 filters, the output should end up as 32 x 32 x 64.
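If you want to check the arithmetic in code, here is a minimal sketch of the same formula. The helper name `conv_output_size` is my own, not something from the course notebooks, and it assumes the same padding, filter size and stride in both h and w.

```python
def conv_output_size(n_in, f, p=0, s=1):
    """Output height/width of a conv layer: floor((n_in + 2p - f) / s) + 1."""
    # Integer division (//) plays the role of the floor function here.
    return (n_in + 2 * p - f) // s + 1


n_out = conv_output_size(n_in=64, f=7, p=3, s=2)
num_filters = 64  # the number of filters becomes the number of output channels
print((n_out, n_out, num_filters))
```

The `//` operator is what takes care of the 63/2 = 31.5 case by rounding down to 31 before the 1 is added.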
Of course, note that this is just the first layer that you show there. BatchNorm and ReLU don’t affect the output size, and then you have a pooling layer at the end, which will further reduce the h and w dimensions according to the same formula as above. Pooling layers preserve the number of channels.
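As a rough sketch of that last step: the pooling parameters aren't stated in this thread, so the 2 x 2 window with stride 2 and no padding below is purely an assumed example, and `output_size` is a hypothetical helper name.

```python
def output_size(n_in, f, p=0, s=1):
    """Same formula for conv and pooling layers: floor((n_in + 2p - f) / s) + 1."""
    return (n_in + 2 * p - f) // s + 1


# Shape coming out of the conv layer discussed above.
h, w, channels = 32, 32, 64

# ASSUMPTION: 2 x 2 pooling window, stride 2, no padding (not given in the thread).
pool_f, pool_s = 2, 2
h_out = output_size(h, f=pool_f, s=pool_s)
w_out = output_size(w, f=pool_f, s=pool_s)

# Pooling acts on each channel separately, so the channel count is unchanged.
print((h_out, w_out, channels))
```

With different pooling settings the h and w values will change, but the number of channels stays at 64 either way.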
It was so puzzling for me to get a fractional value (63/2). The floor function solves this. Thanks for clarifying this doubt, Paulin. Cheers!