Why can a 1x1 conv (network in network) save computation cost?

Hey there,

I’m curious why a 1x1 convolution saves computational cost. In the course we work through specific examples: we insert a 1x1 bottleneck layer, calculate the cost by hand, and compare it to a normal conv. But I really don’t understand the mathematical basis behind it.

Can anyone explain this in a more general way?

thanks :grinning:

Hi Chris.X,

Maybe this explains?
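
In general terms, and only as a rough sketch: a k x k "same" convolution from c_in to c_out channels on an h x w feature map needs about h * w * c_out * k * k * c_in multiplications. If a 1x1 conv first reduces c_in to a smaller c_mid, the total becomes h * w * c_mid * c_in + h * w * c_out * k * k * c_mid. The function names and the shapes below (28x28x192 input, 5x5 conv, 32 output channels, 16 bottleneck channels) are illustrative assumptions, not something taken from this thread, and the count ignores biases and additions.

```python
def conv_mults(h, w, c_in, c_out, k):
    """Multiplications for a k x k 'same' convolution with an h x w x c_out output."""
    # Each of the h * w * c_out output values needs k * k * c_in multiplications.
    return h * w * c_out * k * k * c_in


def bottleneck_mults(h, w, c_in, c_mid, c_out, k):
    """Multiplications for a 1x1 reduction to c_mid channels followed by a k x k conv."""
    reduce_cost = conv_mults(h, w, c_in, c_mid, 1)   # 1x1 conv: c_in -> c_mid
    expand_cost = conv_mults(h, w, c_mid, c_out, k)  # k x k conv: c_mid -> c_out
    return reduce_cost + expand_cost


# Illustrative shapes (assumed for this example).
direct = conv_mults(28, 28, 192, 32, 5)
reduced = bottleneck_mults(28, 28, 192, 16, 32, 5)
ratio = reduced / direct

print(f"direct:     {direct:,}")
print(f"bottleneck: {reduced:,}")
print(f"ratio:      {ratio:.3f}")
```

Dividing the two expressions gives ratio = c_mid / c_in + c_mid / (k * k * c_out), so the saving comes from the expensive k x k conv seeing c_mid input channels instead of c_in. The bottleneck only helps when c_mid is clearly smaller than c_in.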
