How small should we reduce the image in a CNN?

Throughout the course, the images are reduced to (7,7) after passing through several CNN layers. My question is: as a rule of thumb, how small should we reduce the image? Thanks in advance.

Hi @wokee

Welcome to our Community! Thanks for reaching out. We are here to help you.

That is an excellent question; it was one of the first I asked when I started designing neural network architectures. However, there is no “right” answer; there are many points of view on what the final image size should be after the convolutional and pooling layers.

It depends on the application and the input image size. There is no rule for how many convolutional and pooling layers to add to get the best performance or a specific final image size. It helps to try several options: increase or decrease the depth of the network and see how the accuracy changes, or add or remove some convolutional layers to end up with a bigger or smaller image.
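To see how adding or removing layers changes the final size, here is a minimal sketch in plain Python. It assumes the usual floor convention for the output size, `(n + 2*padding - kernel) // stride + 1`, and the layer stack is a hypothetical example (not the course's exact model); `output_size` and `trace_sizes` are helper names made up for this illustration.

```python
def output_size(n, kernel, stride=1, padding=0):
    """Spatial size after one conv or pooling layer (floor convention)."""
    return (n + 2 * padding - kernel) // stride + 1


def trace_sizes(n, layers):
    """Return the spatial size after each (kernel, stride, padding) layer."""
    sizes = [n]
    for kernel, stride, padding in layers:
        n = output_size(n, kernel, stride, padding)
        if n < 1:
            raise ValueError("image shrank below one pixel; remove a layer")
        sizes.append(n)
    return sizes


# Hypothetical stack on a 28x28 input:
# 3x3 conv with padding 1 (keeps the size), then 2x2 pooling with stride 2
# (halves it), repeated twice.
layers = [(3, 1, 1), (2, 2, 0), (3, 1, 1), (2, 2, 0)]
print(trace_sizes(28, layers))  # [28, 28, 14, 14, 7]
```

Dropping the last pooling layer from the list would leave a 14x14 image, and adding a third conv and pool pair would bring it down to 3x3, so you can experiment with the final size on paper before training anything.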

Personally, I always try to keep the image larger than (7x7) before the fully connected layers, but of course it depends on the application.

I like the article efficient-way-to-build-neural-network-architectures by Shashank Ramesh, which covers hyperparameters and gives some tips on choosing them. Rather than focusing on the image size before the fully connected layer, I would put more effort into selecting the hyperparameters.

Hope this helps :muscle:

Hi, @wokee!

Part of what we know in Deep Learning has come from “just giving it a try”. With convolutions, you can reduce the image dimensions as much as you want (down to, obviously, one pixel per channel). As @adonaivera said, it really depends on the kind of data you are using.
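As a rough illustration of that lower limit, the sketch below counts how many size-halving steps fit before the image reaches one pixel. It assumes each step is a 2x2 pooling with stride 2 and floor rounding; the input sizes are just examples, and `halvings_until_one` is a name made up for this post.

```python
def halvings_until_one(n):
    """Count 2x2, stride-2 pooling steps (floor rounding) until the size is 1."""
    steps = 0
    while n > 1:
        n //= 2  # (n - 2) // 2 + 1 equals n // 2 for n >= 2
        steps += 1
    return steps


for size in (28, 150, 224):
    print(size, halvings_until_one(size))
# 28 4
# 150 7
# 224 7
```

So a 28x28 image leaves room for at most four such halvings, while stopping after two gives the (7,7) seen in the course.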
