Color shifting vs grayscale

Hi there,

Just watched the data augmentation video and was wondering about the color shifting.
If you apply color shifting, aren't you in fact implying that color is less important for determining the correct image class? If so, why bother using 3 channels in the first place, and why not use grayscale images instead?


Hi there,

No, I don't think the point here is to eliminate color, but rather to tell the model that even if the color changes, the object doesn't change. For example, shoes may change color but remain shoes. So color is still an important feature.
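To illustrate the idea, here's a minimal sketch of color-shift augmentation (my own toy implementation, not from the course): each RGB channel gets a small random offset, which changes the pixel values but leaves the spatial pattern, i.e. the shape of the shoe, untouched.

```python
import numpy as np

def color_shift(image, max_shift=20, rng=None):
    """Randomly shift each RGB channel by a small amount.

    image: H x W x 3 uint8 array; max_shift: largest per-channel offset.
    """
    rng = np.random.default_rng() if rng is None else rng
    # One random offset per channel, applied uniformly to every pixel.
    shifts = rng.integers(-max_shift, max_shift + 1, size=3)
    shifted = image.astype(np.int16) + shifts
    # Clip back into the valid 0..255 range and restore the dtype.
    return np.clip(shifted, 0, 255).astype(np.uint8)

# A tiny 2x2 "image": the channel values move, but the spatial
# structure the convnet relies on is unchanged.
img = np.zeros((2, 2, 3), dtype=np.uint8)
aug = color_shift(img, max_shift=10, rng=np.random.default_rng(0))
```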


Wouldn’t that be the same as saying: here’s my grayscale image. It might have different contrast/brightness, but in the end the shoe should still stay a shoe. Grayscale would decrease the network size I guess?

Maybe my question is more general: when is color actually needed? Not in case of detecting the shoes I guess :slight_smile:

But the color of each pixel is still an input to the convnet, i.e. a feature. Grayscale images are not the same as color images: they have fewer inputs. The network may well learn even from grayscale, but the more features the convnet is given, the better it can use them to predict. Basically, color gives the network more features to work with, which can lead to better predictions.
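To make the "fewer inputs" point concrete, here's a small sketch (assuming the common BT.601 luminance weights for grayscale conversion, which is an assumption on my part, not something from the course): two visibly different colors can collapse to the same grayscale value, so a grayscale network loses information that a color network keeps.

```python
import numpy as np

weights = np.array([0.299, 0.587, 0.114])  # BT.601 luminance weights

# Two clearly different pixels: pure red vs a medium green...
red   = np.array([1.0, 0.0, 0.0])
green = np.array([0.0, 0.299 / 0.587, 0.0])

# ...that map to the same grayscale intensity, so a grayscale
# network literally cannot tell them apart.
same_gray = np.isclose(red @ weights, green @ weights)  # True
```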


It’s an interesting point that whether you need color or not depends on what your classification task is. If it’s recognizing shoes, then you’re probably right that grayscale images would suffice and save you some compute and storage costs. If the task is to recognize brown shoes, or to tell the difference between a red light and a green light, then it’s a different matter. :nerd_face: Of course in reality, that red light vs green light case may be a red herring (pun intended): you can probably recognize traffic lights by their orientation. But you get the point …

There are some interesting follow-on questions and experiments one could do here. E.g. suppose we went back to the cat classification exercise with the 4-layer network in C1 W4 and ran it again after converting all the training and test data to grayscale: how would that affect the classification accuracy? Of course that data might not be such a good basis for the experiment, given that the datasets are way too small to get a realistic picture. FWIW, here’s a thread about some experiments tweaking those small datasets and the amount of perturbation you can see. If we really wanted valid experiments on whether color helps, we should find some datasets of a more realistic size.
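One way the experiment could be set up, as a hedged sketch: the dataset here is a random stand-in (the real C1 W4 data and its variable names are not reproduced), and the BT.601 luminance weights are my assumption for the conversion. The point is that only the first layer's input dimension changes; the rest of the training code could stay the same.

```python
import numpy as np

# Hypothetical stand-in for the course dataset: 10 RGB images of 64x64.
rng = np.random.default_rng(0)
train_x_rgb = rng.random((10, 64, 64, 3))

# Collapse the channels with luminance weights, then flatten the
# images into column vectors as the C1 W4 exercise does.
weights = np.array([0.299, 0.587, 0.114])
train_x_gray = train_x_rgb @ weights          # shape (10, 64, 64)
flat_rgb = train_x_rgb.reshape(10, -1).T      # 12288 features per image
flat_gray = train_x_gray.reshape(10, -1).T    # 4096 features per image
# Only the first weight matrix of the 4-layer network would need
# resizing; everything downstream is unchanged.
```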