Using a 1x1 conv with the same number of channels as the input image acts like a non-linearity and will help in learning the non-linear operator.

I do not get how a 1x1 conv helps in learning the non-linear operator.

Thanks

Non-linearity depends on the output activation function. If you look here, `activation=None`, which means the output activation is linear (the identity). When you set it to a non-linear function such as `relu`, non-linearity comes into play.
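A minimal sketch of the difference, assuming NumPy is available. With `activation=None` the layer just returns its linear pre-activation; with `relu` the negative values are clipped, which is where the non-linearity enters:

```python
import numpy as np

# Hypothetical pre-activation output z of a conv layer.
z = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])

linear_out = z                    # activation=None: identity, still linear
relu_out = np.maximum(z, 0.0)     # activation='relu': negatives become 0

print(linear_out)  # [-2.  -0.5  0.   1.5  3. ]
print(relu_out)    # [0.  0.  0.  1.5 3. ]
```

Without the `relu`, stacking such layers just composes linear maps, which collapses to a single linear map.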

The output of a convolution layer has a number of channels equal to the number of conv filters used.

A 1x1 convolution lets you change the number of channels of the output while keeping the same height and width as the input.
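To see why, note that a 1x1 convolution is just a per-pixel linear map across channels. A sketch with NumPy (shapes and channels-last layout are my assumptions, not from the question):

```python
import numpy as np

# Assumed shapes: H x W feature map with C_in channels, C_out 1x1 filters.
H, W, C_in, C_out = 8, 8, 64, 16
x = np.random.rand(H, W, C_in)
w = np.random.rand(C_in, C_out)   # each column is one 1x1xC_in filter
b = np.zeros(C_out)

# 1x1 conv = the same channel-mixing matmul applied at every pixel.
y = x @ w + b

print(x.shape, "->", y.shape)     # (8, 8, 64) -> (8, 8, 16)
```

Height and width are untouched; only the channel dimension changes, and it changes to the number of filters, matching the point above.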