Inception motivation

So Andrew says in the video that instead of having to choose between different filter sizes, we can just apply them all in a single layer and concatenate the results at the end. Why does this even make sense? How does it improve the model? We still have to have multiple layers.

Yes, but the point is that you are effectively doing a bunch of different things in parallel within each layer. In Prof Ng's example, maybe the 1 x 1 filter learns something different than the 3 x 3 and 5 x 5 filters do. In the normal "serial" ConvNet architecture, you are limited to one filter size per layer, so you may miss some things by constraining your choice in that way.
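Here is a minimal NumPy sketch of the idea (not the actual GoogLeNet code, and the filter counts are made up): each parallel branch uses "same" padding, so all branch outputs share the same spatial dimensions and can be concatenated along the channel axis.

```python
import numpy as np

def conv2d_same(x, weights):
    # x: (C_in, H, W); weights: (C_out, C_in, kH, kW) with odd kernel sizes.
    # "Same" padding keeps the output H x W equal to the input H x W.
    c_out, c_in, kh, kw = weights.shape
    _, h, w = x.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((0, 0), (ph, ph), (pw, pw)))
    out = np.zeros((c_out, h, w))
    for o in range(c_out):
        for i in range(h):
            for j in range(w):
                out[o, i, j] = np.sum(xp[:, i:i + kh, j:j + kw] * weights[o])
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((3, 8, 8))  # a 3-channel 8x8 input

# Three parallel branches: 1x1, 3x3, 5x5 filters (hypothetical channel counts)
branches = [conv2d_same(x, rng.standard_normal((n, 3, k, k)))
            for n, k in [(4, 1), (6, 3), (2, 5)]]

# Because spatial dims match, we can stack along the channel axis
y = np.concatenate(branches, axis=0)
print(y.shape)  # (12, 8, 8): 4 + 6 + 2 channels, same 8x8 grid
```

The network then learns, via the channel weights in the next layer, how much to rely on each filter size, rather than the architect having to pick one up front.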

I think Prof Ng says all this in the lectures, so you might want to watch them again with the above ideas in mind.