Regarding Inception Model Architecture

I was reading the article linked in the course, "A Simple Guide to the Versions of the Inception Network" by Bharath Raj (Towards Data Science), and came across this point: "As stated before, deep neural networks are computationally expensive. To make it cheaper, the authors limit the number of input channels by adding an extra 1x1 convolution before the 3x3 and 5x5 convolutions." I can't wrap my mind around why the 1x1 convolution is added and how it helps reduce the overall computation required.

Here is a comparison picture from the article; it should make clear what I am asking.

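A 1x1 convolution can be used to reduce the number of channels, so the 3x3 and 5x5 convolutions that follow it operate on fewer input channels and need far fewer multiplications. Hope this link helps.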
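To make the saving concrete, here is a rough multiplication count for a single 5x5 branch, with and without a 1x1 reduction in front of it. The layer sizes (a 28x28x192 input, 32 output filters, a reduction to 16 channels) are assumptions picked for illustration, not figures from the article, and the helper `conv_mults` is a hypothetical name. It counts multiplications only, assuming stride 1 and "same" padding so the spatial size does not change.

```python
def conv_mults(height, width, c_in, c_out, kernel):
    """Multiplications for one conv layer (stride 1, 'same' padding assumed).

    Each of the height * width * c_out output values needs
    kernel * kernel * c_in multiplications.
    """
    return height * width * c_out * kernel * kernel * c_in


# Assumed sizes, for illustration only.
H, W = 28, 28        # spatial size of the input feature map
C_IN = 192           # input channels
C_OUT = 32           # filters in the 5x5 convolution
C_REDUCED = 16       # channels left after the 1x1 "bottleneck"

# Option A: 5x5 convolution applied directly to all 192 channels.
direct = conv_mults(H, W, C_IN, C_OUT, kernel=5)

# Option B: 1x1 convolution down to 16 channels, then the 5x5 convolution.
reduced = (
    conv_mults(H, W, C_IN, C_REDUCED, kernel=1)
    + conv_mults(H, W, C_REDUCED, C_OUT, kernel=5)
)

print(f"direct 5x5:     {direct:,} multiplications")
print(f"1x1 then 5x5:   {reduced:,} multiplications")
print(f"reduction:      {direct / reduced:.1f}x fewer")
```

With these assumed sizes the direct 5x5 convolution needs about 120 million multiplications, while the 1x1 plus 5x5 version needs about 12 million. The output shape is the same in both cases (28x28x32); the expensive 5x5 kernel simply sees 16 input channels instead of 192.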


Thank you so much, this is what I was looking for!