How does convolution to fewer channels work?

What is the calculation done when going from an input of a higher number of channels to fewer channels in a one by one convolution?

If you have an input of 28 x 28 x 48 (48 being the number of channels) and you convolve it using a 1 x 1 x 16 filter, what exactly is the multiplication that produces a 1 x 1 x 16 output for each application of the filter?

I understand that this would be applied 28 x 28 times, but I am not clear on what happens on each application of the filter.

Hello @rama_Rao,

With 16 filters of kernel size 1 x 1 (we don’t say 1 x 1 x 16, so that it is not mistaken for an array of that size) on an input of 28 x 28 x 48, each filter has a shape of 1 x 1 x 48. At each pixel, a filter multiplies the 48 channel values by its 48 weights and sums the products (plus a bias), which gives a single number. Applied at all 28 x 28 positions, one filter therefore produces a 28 x 28 x 1 output. Finally, we stack the outputs of the 16 filters to get the 28 x 28 x 16 final output.

You might create a simple input and a simple Conv2D layer with TensorFlow to test and verify the idea.
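
If it helps, here is a minimal NumPy sketch of the same calculation, not a TensorFlow run. The random input, the weight matrix `w`, the zero bias `b`, and the pixel position `(i, j)` are all made up for illustration. It computes the 16 outputs at one pixel as 16 dot products over the 48 channels, then checks that this matches doing the whole input at once.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: one 28 x 28 input with 48 channels,
# and 16 filters of kernel size 1 x 1 (each with 48 weights).
x = rng.standard_normal((28, 28, 48))
w = rng.standard_normal((48, 16))  # column k holds the 48 weights of filter k
b = np.zeros(16)                   # one bias per filter (zero here for simplicity)

# At a single pixel, each filter is a dot product over the 48 channels.
i, j = 3, 5
pixel_out = np.array([np.dot(x[i, j, :], w[:, k]) + b[k] for k in range(16)])

# Doing this at every pixel is one matrix product over the channel axis.
y = x @ w + b

print(pixel_out.shape)                  # (16,)
print(y.shape)                          # (28, 28, 16)
print(np.allclose(y[i, j], pixel_out))  # True
```

A Keras `Conv2D` layer with 16 filters and kernel size 1 stores its kernel with shape (1, 1, 48, 16) for this input, so `w` above corresponds to that kernel with the two leading size-1 axes dropped.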

Cheers,
Raymond