Inception module computation cost

We saw how splitting a 5x5 convolution into two steps, by adding a 1x1 convolution as a first step, reduces computation by roughly a factor of 10 (from approximately 120 million operations to approximately 12 million).
I understand how this works, but by doing this we are also reducing the number of learnable parameters by roughly a factor of 10.
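
As a rough check of that factor of 10, here is a minimal sketch that counts the multiplications for both designs. It assumes the shapes from the video (28x28x192 input, "same" padding so the output stays 28x28, 32 output channels, and a 16-channel 1x1 bottleneck); the helper name `conv_mults` is only illustrative.

```python
# Assumed shapes from the lecture example (not taken from any library):
# 28x28x192 input, "same" padding, 32 output channels, 16-channel bottleneck.
H, W, C_IN = 28, 28, 192
C_OUT, C_MID, K = 32, 16, 5


def conv_mults(h, w, c_in, c_out, k):
    """Multiplications for a k x k convolution producing an h x w x c_out output."""
    # Each output value needs k * k * c_in multiplications.
    return h * w * c_out * k * k * c_in


# Direct 5x5 convolution on the 192-channel input.
direct = conv_mults(H, W, C_IN, C_OUT, K)

# 1x1 convolution down to 16 channels, then 5x5 convolution up to 32 channels.
bottleneck = conv_mults(H, W, C_IN, C_MID, 1) + conv_mults(H, W, C_MID, C_OUT, K)

print(f"direct:     {direct:,}")       # 120,422,400
print(f"bottleneck: {bottleneck:,}")   # 12,443,648
print(f"ratio:      {direct / bottleneck:.2f}")
```

Additions are ignored here, so these are the same rounded "120 million vs. 12 million" figures rather than exact FLOP counts.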

With just the 5x5 CONV layer (32 filters) and an input of 28x28x192, as in the video, we get 5 × 5 × 192 × 32 = 153,600 learnable parameters (ignoring biases).

Adding the 1x1 CONV layer (16 filters) on the same input, followed by the 5x5 CONV layer with 32 filters, gives us (1 × 1 × 192 × 16) + (5 × 5 × 16 × 32) = 15,872 learnable parameters.
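
The two parameter counts can be reproduced with a small sketch like the one below. It counts weights only (biases are left out, as in the numbers above), and the helper name `conv_weights` is hypothetical, not from any framework.

```python
def conv_weights(k, c_in, n_filters):
    """Weights in a k x k convolution layer, ignoring biases (an assumption)."""
    # Each filter spans k x k spatially and all c_in input channels.
    return k * k * c_in * n_filters


# Single 5x5 CONV layer: 32 filters over the 192-channel input.
direct_params = conv_weights(5, 192, 32)

# 1x1 CONV (16 filters) followed by 5x5 CONV (32 filters over 16 channels).
bottleneck_params = conv_weights(1, 192, 16) + conv_weights(5, 16, 32)

print(f"direct:     {direct_params:,}")      # 153,600
print(f"bottleneck: {bottleneck_params:,}")  # 15,872
print(f"ratio:      {direct_params / bottleneck_params:.2f}")
```

Note that the parameter count does not depend on the 28x28 spatial size, while the operation count does: every weight is reused at each of the 28 × 28 output positions.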

Why does this trade-off, losing parameters in exchange for less computation time, work out to our advantage?
(Trying humour out in explanation)
Hey, so I made our model 10x faster!
Really? How?
It learns 10 times fewer parameters.

Yes, and it reduces overfitting along the way! Have a look here:

I had a slightly different question about this computation cost reduction.

It is great that the number of computations is reduced, but do the two ways of doing the computation, i.e. expensive vs. cheap, yield the same results?

Or is the premise that they yield equivalent results, with the added benefit of learning fewer parameters?