DLS course 4, week 2, Network in Network (1 x 1 Convolution)

1 x 1 convolutions maintain the nh and nw values while changing nc. We can achieve the same output shape using a convolution layer with "same" padding and the required number of filters. What difference does the activation part make in a 1 x 1 convolution compared to a "same"-padding conv layer? What does adding the non-linearity do?

The main purpose of a 1x1 convolution is to reduce the computational cost.

A large convolution like 3x3 or 5x5 requires a lot of computation: for a given input and output size, the cost is roughly proportional to the filter's "height x width". So a 1x1 convolution requires much less computation than a larger filter with "same" padding. (A 1x1 convolution can even run comfortably on a CPU.)
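To make that concrete, here is a quick back-of-the-envelope count of multiplications in plain Python. The 28x28x192 sizes echo the bottleneck example from the lectures, but they are just illustrative:

```python
# Rough multiply count for one conv layer:
# out_h * out_w * n_filters * (f * f * n_c_in) multiplications.
def conv_multiplies(out_h, out_w, n_filters, f, n_c_in):
    return out_h * out_w * n_filters * f * f * n_c_in

# Direct 5x5 "same" convolution: 28x28x192 -> 28x28x32
direct = conv_multiplies(28, 28, 32, 5, 192)

# Same mapping through a 1x1 bottleneck: 28x28x192 -> 28x28x16 -> 28x28x32
bottleneck = (conv_multiplies(28, 28, 16, 1, 192)    # 1x1 squeezes the channels
              + conv_multiplies(28, 28, 32, 5, 16))  # 5x5 runs on the thin volume

print(f"direct 5x5:      {direct:,}")      # ~120 million
print(f"with bottleneck: {bottleneck:,}")  # ~12 million, about a 10x saving
```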

But the real advantage is how it is used.
It is mostly used to reduce the number of channels "before" a large convolution, which cuts the computational cost of that convolution. Then, "after" the large convolution, another 1x1 convolution can cheaply restore the channel count to the original size.
This pattern is called a "bottleneck", and it is used by several convolutional networks (for example, Inception modules and ResNet bottleneck blocks). A minimal sketch is below.
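Here is a minimal sketch of such a bottleneck block in Keras (assuming TensorFlow 2.x; the layer sizes are made up for illustration):

```python
import tensorflow as tf
from tensorflow.keras import layers

inputs = tf.keras.Input(shape=(28, 28, 192))

# 1x1 squeezes 192 channels down to 16 before the expensive convolution
x = layers.Conv2D(16, kernel_size=1, activation="relu")(inputs)
# the 3x3 convolution now runs on the thin 16-channel volume
x = layers.Conv2D(16, kernel_size=3, padding="same", activation="relu")(x)
# 1x1 restores the channel count to 192 afterwards
x = layers.Conv2D(192, kernel_size=1, activation="relu")(x)

model = tf.keras.Model(inputs, x)
model.summary()  # nh and nw stay 28x28 throughout; only nc changes
```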

“Activation” is a kind of additional bonus. :slight_smile: Of course, we can attach an activation function to the 1x1 convolution to add non-linearity, which also increases the representational capability of the network.
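In that sense, a 1x1 convolution with a ReLU behaves like a small fully connected layer (plus non-linearity) applied at every spatial position, which is where the "Network in Network" name comes from. A tiny sketch with made-up shapes:

```python
import tensorflow as tf
from tensorflow.keras import layers

x = tf.random.normal((1, 28, 28, 192))  # dummy activations
y = layers.Conv2D(32, kernel_size=1, activation="relu")(x)
print(y.shape)  # (1, 28, 28, 32): spatial size unchanged, channels remixed non-linearly
```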

Hope this helps.


This clears it up. Thank you so much!