In both cases we see that the number of channels is reduced while height and width stay the same. Are there any advantages of one over the other? In the inception architecture lecture, the professor used a 1x1 convolution to reduce the number of channels. If we instead use, say, a 5x5 filter with "same" padding, does that affect performance in any way?
With a 1x1 convolution you are only merging the channels.
With a bigger filter (say 5x5), you also take into account the values of the adjacent pixels, so you are mixing that spatial information together as well. Even though the convolution preserves most of the spatial information, compared to the 1x1 case you would still lose some resolution.
So I'd say it depends on what you want and on what layers come after it. For example, if you apply a large max pooling right after, there may not be a big difference.
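To make the distinction concrete, here is a minimal NumPy sketch (the shapes and random weights are just illustrative, not from the lecture). A 1x1 convolution is a per-pixel matrix multiply over channels, while a 5x5 "same"-padded convolution also mixes each pixel's 5x5 spatial neighborhood; both produce the same output shape.

```python
import numpy as np

# Illustrative shapes: an 8x8 feature map with 16 channels, reduced to 4.
H, W, C_in, C_out = 8, 8, 16, 4
x = np.random.rand(H, W, C_in)

# 1x1 convolution: each output pixel is a linear combination of the
# input channels at that SAME pixel -- effectively a matrix multiply.
w1 = np.random.rand(C_in, C_out)
y1 = x @ w1                                   # shape (8, 8, 4)

# 5x5 convolution with "same" zero padding: same output shape, but each
# output pixel now mixes values from a 5x5 spatial neighborhood.
k = 5
w5 = np.random.rand(k, k, C_in, C_out)
xp = np.pad(x, ((k // 2, k // 2), (k // 2, k // 2), (0, 0)))
y5 = np.zeros((H, W, C_out))
for i in range(H):
    for j in range(W):
        patch = xp[i:i + k, j:j + k, :]       # 5x5 neighborhood around (i, j)
        y5[i, j] = np.tensordot(patch, w5, axes=([0, 1, 2], [0, 1, 2]))

print(y1.shape, y5.shape)  # both (8, 8, 4): channels shrink, H and W unchanged
```

Both outputs have the same shape, so the difference is purely in what information each output pixel draws on.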
In the course Prof. Ng mentioned that you can shrink the number of channels/filters/kernels with a 1x1 convolution. But you can do that with "same" padding and fewer filters as well, right?
I get that if you use "same" padding with a bigger filter, e.g. 3x3, you are altering the information, since you are computing over a neighborhood. But with a 1x1 filter you are also altering the information, by multiplying each pixel's channels by the weights of the 1x1 filter, right?
So in both cases you are computing new information, and the resulting number of channels is the same, right?
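Yes, both alter the information and both can produce the same number of output channels. One practical difference worth noting is cost: a k x k filter needs k² times the parameters (and multiplications) of a 1x1 filter for the same channel reduction. A quick back-of-the-envelope check, with illustrative channel counts:

```python
# Shrinking 192 channels down to 32 (numbers chosen for illustration).
c_in, c_out = 192, 32

params_1x1 = 1 * 1 * c_in * c_out   # weights per 1x1 layer
params_5x5 = 5 * 5 * c_in * c_out   # weights per 5x5 "same" layer: 25x more

print(params_1x1, params_5x5)       # 6144 vs 153600
```

This is one reason the inception architecture uses 1x1 convolutions specifically for channel reduction: you get the channel mixing without paying for spatial mixing you may not need at that point.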