At 2:29, how should I understand the use of a 1x1 convolution to increase the number of channels? I'm sorry, this chapter is a little difficult for me. If the question sounds a bit naive, please be patient with me. Mr. Ng's explanation is a little hard for me to follow.
This is the slide:
Here is an overview of 1x1 convolution.
The input shape is (h, w, ch), and the number of 1x1 filters is “m”.
Each filter creates an (h, w, 1) feature map. Since we use "m" filters, the final output is (h, w, m), as in the sketch below.
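Here is a minimal shape check in PyTorch (the input size 28x28x192 and m = 32 are just example numbers I picked, not from the slide):

```python
import torch
import torch.nn as nn

x = torch.randn(1, 192, 28, 28)              # PyTorch layout: (batch, ch, h, w)
conv1x1 = nn.Conv2d(192, 32, kernel_size=1)  # m = 32 filters, each of size 1x1x192
y = conv1x1(x)
print(y.shape)                               # torch.Size([1, 32, 28, 28]) -> (h, w, m)
```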
So a 1x1 convolution is sometimes used to reduce the number of channels by setting "m" to a small number. In the case of Inception, there are multiple paths that include 3x3 or 5x5 convolutions, and those require a lot of computation (and time). By applying a 1x1 convolution first, the number of channels can be reduced, so the computation (and time) is much lower than before, which contributes to the overall performance.
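To make the saving concrete, here is a rough multiply count, assuming purely illustrative numbers (a 28x28x192 input, 32 output channels, and a 1x1 reduction to 16 channels), similar in spirit to the lecture's Inception example:

```python
# Rough multiply counts for a single 5x5 conv layer, with and without a 1x1 reduction.
h, w, ch_in, ch_mid, ch_out = 28, 28, 192, 16, 32

direct_5x5 = h * w * ch_out * (5 * 5 * ch_in)                 # ~120 million multiplies
reduced    = h * w * ch_mid * (1 * 1 * ch_in) \
           + h * w * ch_out * (5 * 5 * ch_mid)                # ~12 million multiplies
print(f"{direct_5x5:,} vs {reduced:,}")                       # 120,422,400 vs 12,443,648
```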
Of course, a 1x1 convolution can also be used to increase the number of channels by using a large number of filters. This is often seen in a bottleneck block. A bottleneck block uses 1) a 1x1 convolution with a small number of filters to reduce the number of channels, 2) a 3x3 (or other) convolution to create feature maps, and 3) a 1x1 convolution with a larger number of filters to restore the number of channels.
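A minimal sketch of such a block in PyTorch (the 256 -> 64 -> 256 channel sizes are just example numbers, as used in ResNet-style bottlenecks, not something specific to this lecture):

```python
import torch.nn as nn

bottleneck = nn.Sequential(
    nn.Conv2d(256, 64, kernel_size=1),            # 1) 1x1 conv reduces the channels
    nn.ReLU(),
    nn.Conv2d(64, 64, kernel_size=3, padding=1),  # 2) 3x3 conv on the cheaper, thinner tensor
    nn.ReLU(),
    nn.Conv2d(64, 256, kernel_size=1),            # 3) 1x1 conv puts the channel count back
)
```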
It’s very useful, actually.