If the goal of "same" padding and 1 x 1 convolutions is the same, namely to keep the height and width of a convolution layer unchanged, then what is the difference between them, and when should each one be used?
Those topics are completely separate. "Same" padding can be used with any kind of convolution in cases in which you do not want to reduce the height and width dimensions. But note that once you switch to using TF/Keras to implement convolutions, the preservation of h and w with "same" padding only happens when stride = 1. Here's a thread which discusses that a bit more.
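To make that concrete, here is a minimal sketch (assuming TF/Keras and an arbitrary 32 x 32 x 3 input, not anything from the course assignments) showing that "same" padding keeps h and w unchanged at stride 1, but not at stride 2:

```python
import tensorflow as tf

# Hypothetical input: a batch of one 32 x 32 RGB image
x = tf.random.normal((1, 32, 32, 3))

# Two 3 x 3 convolutions with "same" padding, differing only in stride
conv_stride1 = tf.keras.layers.Conv2D(8, kernel_size=3, strides=1, padding="same")
conv_stride2 = tf.keras.layers.Conv2D(8, kernel_size=3, strides=2, padding="same")

print(conv_stride1(x).shape)  # (1, 32, 32, 8): h and w preserved
print(conv_stride2(x).shape)  # (1, 16, 16, 8): "same" padding, but stride 2 still halves h and w
```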
1 x 1 convolutions are a very particular kind of convolution used in certain cases. With 1 x 1 convolutions, the h and w dimensions are naturally preserved, so padding is not needed to achieve that. But the whole point of 1 x 1 convolutions is that there is no inclusion of "nearby" pixels in the convolution operation: it only operates across the input channels. Prof Ng has quite a bit of discussion about 1 x 1 convolutions in Week 2 of DLS Course 4. For example, you can start with this lecture that is specifically about 1 x 1 convolutions. But note that they also play a key role in the later discussions (also in Week 2) about MobileNet and "depthwise separable" convolutions. So please "hold that thought" and proceed through Week 2 and listen to all that Prof Ng says on those topics to understand the use cases for 1 x 1 convolutions.
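Here is a similar sketch (again assuming TF/Keras and made-up dimensions) of a 1 x 1 convolution: h and w pass through untouched with no padding at all, and the layer only mixes the input channels at each pixel position, for example to reduce the channel depth:

```python
import tensorflow as tf

# Hypothetical feature map: 28 x 28 spatial size, 192 input channels
x = tf.random.normal((1, 28, 28, 192))

# 1 x 1 convolution: no "nearby" pixels are involved, so no padding is needed
conv_1x1 = tf.keras.layers.Conv2D(filters=32, kernel_size=1, strides=1, padding="valid")
y = conv_1x1(x)

print(y.shape)  # (1, 28, 28, 32): h and w unchanged, channels reduced from 192 to 32
```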