Image input size

Is it necessary to give the input image a 1:1 aspect ratio (like input_size = (150,150))? And is there any extra benefit to doing that?

I don’t recall any mathematical reason for it; the reasons are more pragmatic.

Square images seem to be the “common ground” between portrait and landscape images. Otherwise we would have another parameter to tune: the aspect ratio of the images.

I agree with Yang that there is no mathematical reason for that, but the important point to realize is that most ML/DL algorithms need a consistent input format. You’re going to be training a model to process the images, so they all need to be in a consistent form: the same shape, and pixel values of the same type, e.g. greyscale values or RGB values scaled the same way (the common choices are 0 to 255, 0. to 1., or -1. to 1.).
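To make the scaling options concrete, here is a minimal sketch using NumPy. The function name `scale_pixels`, the mode names, and the random 150 x 100 batch are all made up for illustration; the only point is that every image ends up with the same dtype and value range.

```python
import numpy as np

def scale_pixels(images, mode="unit"):
    """Convert 0-255 pixel data to float32 in a chosen range.

    `mode` is a hypothetical switch for this sketch:
    "unit" gives 0.0 to 1.0, "symmetric" gives -1.0 to 1.0.
    """
    x = np.asarray(images, dtype=np.float32)
    if mode == "unit":
        return x / 255.0
    if mode == "symmetric":
        return x / 127.5 - 1.0
    raise ValueError(f"unknown mode: {mode!r}")

# A made-up batch of 4 RGB images, 150 pixels high and 100 wide (not square)
batch = np.random.randint(0, 256, size=(4, 150, 100, 3), dtype=np.uint8)

unit = scale_pixels(batch, "unit")
sym = scale_pixels(batch, "symmetric")
print(unit.shape, unit.dtype)  # same shape as the input, now float32
```

Whichever range you pick, apply the same one to the training data and to anything you later feed the trained model.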

Most digital cameras do not produce square images, of course. If you have multiple input sources that come in different shapes, you can use an image processing library to preprocess all your images into the common form that you select by some combination of cropping and rescaling. You want that to be a completely separate step that you do once before you start your training, because training is computationally expensive already and you don’t want to make it more expensive by repeating the image resizing every time you train.
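As a rough illustration of the cropping half of that step, the helper below computes the largest centered crop that matches a target aspect ratio. `center_crop_box` is a hypothetical name, and centering the crop is just one possible choice. The returned box is in (left, top, right, bottom) form, which is the form that, for example, Pillow’s `Image.crop` accepts; you would then resize the cropped image to the target size.

```python
def center_crop_box(width, height, target_w, target_h):
    """Return (left, top, right, bottom) for the largest centered crop
    of a width x height image with the aspect ratio target_w:target_h."""
    # Compare aspect ratios by cross-multiplying, which avoids float rounding
    if width * target_h > height * target_w:
        # Source is wider than the target: keep full height, trim the sides
        crop_h = height
        crop_w = height * target_w // target_h
    else:
        # Source is taller (or the same ratio): keep full width, trim top and bottom
        crop_w = width
        crop_h = width * target_h // target_w
    left = (width - crop_w) // 2
    top = (height - crop_h) // 2
    return (left, top, left + crop_w, top + crop_h)

# Landscape camera image cropped for a square 150 x 150 input
print(center_crop_box(4000, 3000, 150, 150))  # (500, 0, 3500, 3000)
# Portrait camera image cropped for a 200 x 150 (4:3) input
print(center_crop_box(3000, 4000, 200, 150))  # (0, 875, 3000, 3125)
```

Note that a centered crop can cut off the subject if it sits near the edge of the frame, so it is worth spot-checking a few results before processing the whole dataset.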

There is no need for the images to be square, but notice that in all the applications we see here in the courses, the images have significantly lower resolution than any digital camera typically produces these days. That’s because you need large training sets, and the storage and CPU power needed to run the training are an important constraint. So one of the early design decisions you need to make is to find the “goldilocks” image resolution that has enough detail to detect whatever you need to distinguish while still being relatively small compared to the typical multi-megabyte size of today’s camera images. The aspect ratio does not need to be 1:1, but it does need to be fixed, and your preprocessing will include downscaling the images to the resolution that you select.
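A quick back-of-the-envelope calculation shows why resolution matters so much. The numbers here are assumptions picked for illustration: 10,000 images, 3 colour channels, one byte per value, stored uncompressed, and a 4000 x 3000 camera image compared with a 150 x 150 input.

```python
def raw_dataset_bytes(num_images, width, height, channels=3, bytes_per_value=1):
    """Uncompressed size in bytes of a dataset of same-sized images."""
    return num_images * width * height * channels * bytes_per_value

n = 10_000  # hypothetical training set size
small = raw_dataset_bytes(n, 150, 150)    # downscaled inputs
large = raw_dataset_bytes(n, 4000, 3000)  # full camera resolution

print(f"150 x 150:   {small / 1e9:.3f} GB")  # 0.675 GB
print(f"4000 x 3000: {large / 1e9:.1f} GB")  # 360.0 GB
```

Converting the pixels to float32 for training multiplies both figures by four, so the gap in memory use only gets wider.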

Sorry for the late reply. Thanks a lot to both of you for your clear explanations.
