Input Layer for a Conv Net

Hi,

Does every input to a Convolutional Network have to have dimensions n x n x c, where c <= 3 (as in RGB)? Are there any edge cases?

A Convolutional Network is a general-purpose architecture. There is no fixed rule for the size of any of the dimensions, including the number of input channels. ConvNets are not necessarily used only for processing images. Even within image processing applications, there are many different types of images. Greyscale images have only one input channel, while some image types have 4 channels (e.g. PNG images can include an alpha channel). In medical imaging, some images are even “volumetric”, meaning that they have 3 spatial dimensions. I’ve never dealt with satellite imagery, so I don’t know whether that introduces new image types with different numbers of channels.
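In case it helps to see this concretely, here is a minimal NumPy sketch of a single convolution step (no padding, stride 1). It is only an illustration under my own assumptions, not how any particular framework implements it: the function name `conv_forward` and the (H, W, C) / (f, f, C, n_filters) layouts are choices made for this example. The only real constraint it shows is that the filter depth has to match the input's channel count, whatever that count is.

```python
import numpy as np

def conv_forward(x, filters):
    """Valid (no padding), stride-1 convolution of one input with a filter bank.

    x:       array of shape (H, W, C); C can be any positive integer
    filters: array of shape (f, f, C, n_filters); its depth must equal C
    returns: array of shape (H - f + 1, W - f + 1, n_filters)
    """
    H, W, C = x.shape
    f, f2, C_f, n_filters = filters.shape
    if f != f2:
        raise ValueError("this sketch assumes square filters")
    if C_f != C:
        raise ValueError(f"filter depth {C_f} does not match input channels {C}")

    out = np.zeros((H - f + 1, W - f + 1, n_filters))
    for i in range(H - f + 1):
        for j in range(W - f + 1):
            patch = x[i:i + f, j:j + f, :]  # shape (f, f, C)
            # Sum over height, width and channels for every filter at once.
            out[i, j, :] = np.tensordot(patch, filters, axes=([0, 1, 2], [0, 1, 2]))
    return out

rng = np.random.default_rng(0)
# 1 = greyscale, 3 = RGB, 4 = RGBA, 12 = a made-up multispectral example
for c in (1, 3, 4, 12):
    x = rng.standard_normal((8, 8, c))
    w = rng.standard_normal((3, 3, c, 5))
    print(c, conv_forward(x, w).shape)
```

Every case in the loop gives an output of shape (6, 6, 5): the number of output channels depends only on the number of filters, not on c. The input does not have to be square either; the same function accepts, for example, a 6 x 10 x 1 input.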

To add to Paul Sir’s detailed explanation: in satellite imagery, there is something known as multispectral imaging, in which each image consists of 4-20 channels (spectral bands, not all of them visible colors). Another pretty interesting thing in satellite imagery is hyperspectral imaging, in which each image consists of more than 20 channels, with each channel holding one band of spectral data for every pixel.

Even if we set satellite imagery aside, GANs offer a great example. When we train a conditional GAN with, say, 10 classes, before feeding the inputs to the discriminator we one-hot encode the class information and append the one-hot encoded vectors to the images as extra channels, as can be seen in the image below. Each entry of the one-hot vector becomes one constant-valued channel, so with 10 classes every image gains 10 channels. Don’t worry if none of this makes sense to you. It’s just something I thought was worth mentioning.
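For anyone curious, the channel-appending step can be sketched in NumPy roughly as follows. This is an illustration under my own assumptions rather than code from any particular GAN implementation: the helper name `append_class_channels` and the channels-last (N, H, W, C) layout are hypothetical choices for this example.

```python
import numpy as np

def append_class_channels(images, labels, num_classes):
    """Append a one-hot class encoding to each image as extra channels.

    images: array of shape (N, H, W, C)
    labels: integer array of shape (N,), values in [0, num_classes)
    returns: array of shape (N, H, W, C + num_classes)
    """
    n, h, w, _ = images.shape
    one_hot = np.eye(num_classes, dtype=images.dtype)[labels]  # (N, num_classes)
    # Repeat each one-hot entry over the full H x W grid, giving one
    # constant-valued plane per class.
    planes = np.broadcast_to(one_hot[:, None, None, :], (n, h, w, num_classes))
    return np.concatenate([images, planes], axis=-1)

images = np.zeros((4, 28, 28, 1))   # four greyscale 28 x 28 images
labels = np.array([3, 0, 9, 3])     # one class label per image
disc_input = append_class_channels(images, labels, num_classes=10)
print(disc_input.shape)
```

Here a 1-channel image ends up with 11 channels, so the first convolution layer of the discriminator would need filters of depth 11.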

Regards,
Elemento
