Using a 1x1 conv with the same number of channels as the input image acts like a non-linearity and will help in learning the non-linear operator.
I do not get how a 1x1 conv helps in learning a non-linear operator.
Thanks
Non-linearity depends on the output activation function. If you look here, the default is activation=None, which means the output activation is linear (identity). Only when you set it to a non-linear function like relu does non-linearity come into play.
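To see why the activation matters, here is a small NumPy sketch (my own illustration, not from the original answer). A 1x1 convolution is just a per-pixel linear map across channels, so without an activation, two stacked 1x1 convs collapse into a single linear map; inserting a ReLU between them breaks that collapse:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 4, 8))   # toy feature map: 4x4 spatial, 8 channels
w1 = rng.standard_normal((8, 8))     # weights of a 1x1 conv (8 in, 8 out)
w2 = rng.standard_normal((8, 8))     # weights of a second 1x1 conv

# Without an activation, two stacked 1x1 convs equal one linear map:
stacked = (x @ w1) @ w2
collapsed = x @ (w1 @ w2)
print(np.allclose(stacked, collapsed))      # True: still linear

# With ReLU in between, the composition is no longer linear:
relu = lambda t: np.maximum(t, 0.0)
nonlinear = relu(x @ w1) @ w2
print(np.allclose(nonlinear, collapsed))    # False: ReLU adds non-linearity
```

So the 1x1 conv itself is linear; it is the activation placed after it that lets the network learn a non-linear operator.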
The output of a convolution layer has as many channels as the number of conv filters used. A 1x1 convolution therefore lets you change the number of channels of the output while keeping the same height and width as the input.
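A quick NumPy sketch of this channel-mixing behavior (an illustration I'm adding, assuming channels-last layout): each 1x1 filter spans all input channels, so applying 16 filters to a 64-channel input yields a 16-channel output at every spatial position.

```python
import numpy as np

def conv1x1(x, w):
    # x: (H, W, C_in); w: (C_in, C_out), i.e. C_out filters of shape 1x1xC_in.
    # At each pixel, the 1x1 conv is a linear map across the channel axis.
    return x @ w

x = np.random.rand(8, 8, 64)   # input: 8x8 spatial, 64 channels
w = np.random.rand(64, 16)     # 16 filters of shape 1x1x64
y = conv1x1(x, w)
print(y.shape)                 # (8, 8, 16): same height and width, 16 channels
```

This is why 1x1 convs are often used to cheaply reduce or expand channel depth before more expensive operations.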