Hidden layers in inception net

How does computing outputs at intermediate layers have a regularizing effect and reduce overfitting?


And you should think of this as maybe just another detail of the Inception network that’s worked. But what it does is it helps to ensure that the features computed, even in the hidden units, even at intermediate layers, are not too bad for predicting the output class of an image. And this appears to have a regularizing effect on the Inception network and helps prevent this network from overfitting.
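
To make the quoted idea concrete, here is a minimal sketch, assuming side-branch classifiers are attached to intermediate layers and their losses are added to the main loss during training. The function names, the toy logits, and the default weight of 0.3 are assumptions for illustration (the GoogLeNet paper reports down-weighting the auxiliary losses by 0.3); this is not the lecture's code.

```python
import math

def cross_entropy(logits, label):
    """Softmax cross-entropy for one example (numerically stable)."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

def total_loss(main_logits, aux_logits_list, label, aux_weight=0.3):
    """Main loss plus down-weighted losses from intermediate (auxiliary) heads."""
    main = cross_entropy(main_logits, label)
    aux = sum(cross_entropy(a, label) for a in aux_logits_list)
    return main + aux_weight * aux

# Hypothetical logits for a 3-class problem; the true class is index 0.
main_logits = [2.0, 0.5, -1.0]
aux_logits = [[0.3, 0.2, 0.1], [1.0, 0.0, -0.5]]
loss = total_loss(main_logits, aux_logits, label=0)
print(f"combined loss: {loss:.4f}")
```

Because each auxiliary head has to predict the label from intermediate features alone, its loss pushes those features to be useful for classification, which is the constraint the quote credits with the regularizing effect. The auxiliary heads are typically used only during training.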

I think the picture below is all about “Inception”. It is from the paper “Going Deeper with Convolutions”.

As you know, a very deep network tends to overfit, much like a high-degree multivariate polynomial fitted to sample data. In addition, a very deep network is prone to vanishing gradients.
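
As a rough illustration of the vanishing-gradient point, the sketch below stacks scalar sigmoid layers and applies the chain rule. The toy setup (a single unit per layer, weight 1.0, input 0.5) is an assumption chosen only to show the trend, not a model of a real network.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def input_gradient(depth, x=0.5, w=1.0):
    """d(output)/d(input) for `depth` stacked scalar layers a -> sigmoid(w * a)."""
    a, grad = x, 1.0
    for _ in range(depth):
        s = sigmoid(w * a)
        grad *= w * s * (1.0 - s)  # chain rule: local derivative of this layer
        a = s
    return grad

for depth in (1, 5, 10, 20):
    print(depth, input_gradient(depth))
```

Each sigmoid layer contributes a factor of at most 0.25, so the gradient reaching the early layers shrinks roughly geometrically with depth.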

The idea of “Inception” is to add a “wider” layer, starting from a simple question: why not use multiple filter sizes in a single layer? The authors actually put convolutional filters of several sizes, plus max pooling, side by side in a single layer, as shown in (a), the naive version. We can expect different outputs from each branch, which works as regularization and also keeps network activity larger (to avoid vanishing gradients).
Of course, this requires additional computation. So they add dimension reduction (1x1 convolutions with a small number of filters), as described in (b), the Inception module with dimension reductions.
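
A back-of-the-envelope sketch of why the 1x1 reductions help: it counts multiplications for a naive module versus one with dimension reduction. The feature-map size and all channel counts below are assumed values for illustration, and `conv_mults` is a hypothetical helper, not something from the paper.

```python
def conv_mults(h, w, c_in, c_out, k):
    """Multiplications for a k x k convolution with 'same' padding and stride 1."""
    return h * w * c_out * (k * k * c_in)

H, W, C_IN = 28, 28, 192          # assumed input feature map
N1, N3, N5 = 64, 128, 32          # assumed output channels of 1x1, 3x3, 5x5 branches
R3, R5, POOL_PROJ = 96, 16, 32    # assumed widths of the 1x1 reduction layers

# (a) naive: every branch convolves the full 192-channel input.
naive = (conv_mults(H, W, C_IN, N1, 1)
         + conv_mults(H, W, C_IN, N3, 3)
         + conv_mults(H, W, C_IN, N5, 5))
naive_out_channels = N1 + N3 + N5 + C_IN  # pooling passes all input channels through

# (b) with reductions: 1x1 convs shrink channels before the 3x3 and 5x5 convs.
reduced = (conv_mults(H, W, C_IN, N1, 1)
           + conv_mults(H, W, C_IN, R3, 1) + conv_mults(H, W, R3, N3, 3)
           + conv_mults(H, W, C_IN, R5, 1) + conv_mults(H, W, R5, N5, 5)
           + conv_mults(H, W, C_IN, POOL_PROJ, 1))  # 1x1 projection after max pooling
reduced_out_channels = N1 + N3 + N5 + POOL_PROJ

print(f"naive:   {naive:,} multiplications, {naive_out_channels} output channels")
print(f"reduced: {reduced:,} multiplications, {reduced_out_channels} output channels")
```

Under these assumed sizes the reduced module needs well under half the multiplications, and its output is also narrower, so the saving compounds when modules are stacked.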

I think this is the essence of “Inception” and the reason the authors named the paper “Going Deeper with Convolutions”.

Hope this helps.