Calculating computational cost

I don’t really understand how we compute the computational cost of an operation.
In the first layer here, the input size is 28x28x192, and the filter size is 1x1x192x16. So why wouldn’t the computation be 28x28x192x1x1x192x16, instead of what is written there? Same goes for the second conv layer.

Because that includes the factor of 192 twice. If you think about how filters work, you’ll see why that is incorrect:

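We have 16 filters, each of which is 1 x 1 x 192. We apply them with a stride of 1 to an input that is shaped 28 x 28 x 192. So for each filter at each position there will be 192 operations, and there are 28 x 28 positions. So for one filter the number of computations is 28 * 28 * 192, and there are 16 filters, which gives 28 * 28 * 192 * 16 in total. The 192 shows up only once because the filter depth and the input depth are the same dimension: each filter value is paired with exactly one input channel value at each position.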
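To make the arithmetic concrete, here is a minimal sketch in Python. The function name `conv_multiply_count` and its parameters are just made up for illustration, and it assumes the output is 28 x 28 (stride 1, with a 1 x 1 filter that does not shrink the input).

```python
def conv_multiply_count(out_h, out_w, f_h, f_w, in_ch, n_filters):
    """Rough cost of a conv layer: one multiply per filter weight,
    per output position, per filter. (Illustrative helper, not from the course.)"""
    per_position = f_h * f_w * in_ch   # weights in one filter
    positions = out_h * out_w          # places the filter is applied
    return positions * per_position * n_filters

# First layer from the question: 28 x 28 x 192 input, 16 filters of 1 x 1 x 192.
first_layer = conv_multiply_count(28, 28, 1, 1, 192, 16)
print(first_layer)  # 2408448, i.e. about 2.4 million

# The formula from the question multiplies by 192 a second time.
double_counted = 28 * 28 * 192 * 1 * 1 * 192 * 16
print(double_counted // first_layer)  # 192
```

The second print shows that the formula in the question overshoots by exactly a factor of 192.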

Note that he’s not actually trying to compute the detailed number of additions and multiplications: he’s just trying to estimate the scale of the computation. For each filter, you have multiplications, followed by additions, and then +1 for adding the bias. But he’s not going into that level of detail here. In practice, these operations will (we hope) all be performed by the “vector units” of the CPU, where the actual “operation” is a “multiply accumulate”, which does the multiply and the add in one step.
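If you do want to see what the detailed count would look like for the first layer, here is a rough sketch. The helper name `detailed_ops_per_output` is hypothetical, and it assumes one bias add per output value and a plain sequential sum, which is not necessarily how any real library schedules the work.

```python
def detailed_ops_per_output(f_h, f_w, in_ch):
    """Operations needed for ONE output value of ONE filter.
    (Illustrative helper; assumes a plain sum plus one bias add.)"""
    multiplies = f_h * f_w * in_ch   # one multiply per filter weight
    adds = multiplies - 1            # summing n products takes n - 1 adds
    bias_adds = 1                    # the +1 for the bias
    return multiplies, adds, bias_adds

mults, adds, bias = detailed_ops_per_output(1, 1, 192)

outputs = 28 * 28 * 16                            # positions times filters
macs_total = outputs * mults                      # one multiply-accumulate per weight
detailed_total = outputs * (mults + adds + bias)  # separate multiplies and adds

print(mults, adds, bias)   # 192 191 1
print(macs_total)          # 2408448
print(detailed_total)      # 4816896
```

The detailed count comes out at exactly twice the multiply count, so the two differ only by a constant factor. That is why counting just the multiplications is good enough for comparing the cost of different layers.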

That makes sense now, thanks!