Padding optimization

Can I avoid training convolution activations in the zero-padding zones? I guess that backprop over values in artificial zeros isn't an ideal situation.
Are there functions to avoid that? Or does it really not matter?
Thanks.

Hi, @z009LL!

Padding is always applied to the input of a convolutional layer, and the activation function is computed on that layer's output. The output therefore carries no padding of its own, so you don't need to worry about it.
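Here is a minimal NumPy sketch of that ordering (my own toy example, not code from the course assignments): the zeros are added only to the input, the convolution is computed, and then the activation is applied to the unpadded output.

```python
import numpy as np

def conv2d_single_channel(x, w, pad=1):
    """'Same'-style convolution (stride 1) for one 2-D channel with an f x f filter."""
    x_padded = np.pad(x, pad, mode="constant", constant_values=0)  # zeros go on the input only
    f = w.shape[0]
    out = np.zeros_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(x_padded[i:i + f, j:j + f] * w)
    return out

x = np.arange(16, dtype=float).reshape(4, 4)   # toy 4x4 input
w = np.ones((3, 3))                            # toy 3x3 filter
z = conv2d_single_channel(x, w, pad=1)
a = np.maximum(z, 0)                           # ReLU is applied to the conv output
print(a.shape)                                 # (4, 4) -- the output has no padding of its own
```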

The other thing to point out is that back propagation is not modifying the data values, right? It's modifying the weight and bias values, meaning the components of the filters. That's what we are learning. It is driven by the data, but it "is what it is" based on how forward propagation works and the derivatives.

The point of padding is twofold: it can be used to maintain the shapes so that they don't shrink too quickly as you move through the layers, and it also means that the real data at the "edges" of the input at any given layer gets to have more of an effect on the results. Think about the very last row and column of pixels in the input: if you don't have any padding, they literally get included in only one output value in each channel, right? Whereas with padding, multiple positions of the output are affected by that edge pixel (see the little counting sketch below).

So maybe padding actually makes back propagation work better, because the gradients get a more complete view of the input data. Just a thought, but I think I remember Prof Ng saying something about that in the lecture where he introduces padding. I confess it's been a while since I watched those early lectures in C4, so it's probably worth going back and refreshing my memory! :nerd_face:
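Here is a quick sanity check of that edge-pixel argument (again my own sketch, not from the lectures): it just counts how many 3x3 windows, sliding with stride 1, cover the top-left input pixel with and without padding.

```python
import numpy as np

def windows_touching_corner(n=6, f=3, pad=0):
    """Count the f x f sliding windows (stride 1) that cover input position (0, 0)."""
    out_dim = n + 2 * pad - f + 1
    corner = (pad, pad)  # where the original (0, 0) pixel sits after padding
    count = 0
    for i in range(out_dim):
        for j in range(out_dim):
            if i <= corner[0] < i + f and j <= corner[1] < j + f:
                count += 1
    return count

print(windows_touching_corner(pad=0))  # 1 -> without padding the corner pixel feeds a single output
print(windows_touching_corner(pad=1))  # 4 -> with 'same' padding it feeds several output positions
```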
