Improving the ResNet50 Model

Hi,
As I reached the end of W2A1, I came across the point where I uploaded my image and tried to make a classification:


It should have been 5.
The hint says that it might be related to the data distributions. Could it be a problem with the relatively small dataset we used for training (only 1080 images, as opposed to the hundreds of thousands to millions required for something like this)? I also feel that the images have very low resolution. Could that impact performance as well?
One more thing I noticed: before this example, I took another picture of my hand in a dim environment (with a dim background).
For this, the predicted class was 0, so the error seems much larger. Does this indicate that illumination and background also have a role to play, and that more illumination, or a better-lit background, can give better performance? It also seemed a little counterintuitive: I thought the plain green background should make classification easy for the algorithm, whereas the window view of a city background should make it harder. Can this be explained too?

Yes, it could be, because the model might not have been trained on similar data!

Yes, a low-resolution image carries less information than a high-resolution one!
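To see the information loss concretely, here is a minimal sketch (the array is a synthetic stand-in for a hand image, not data from the assignment): downsampling a 64x64 image by 4x and upsampling it back cannot recover the fine detail, and the reconstruction error measures what was lost.

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.random((64, 64))  # synthetic stand-in for a grayscale hand image

# Naive 4x downsample: average non-overlapping 4x4 blocks -> 16x16
small = img.reshape(16, 4, 16, 4).mean(axis=(1, 3))

# Upsample back by pixel repetition; fine detail is gone for good
restored = np.repeat(np.repeat(small, 4, axis=0), 4, axis=1)

# Nonzero reconstruction error = information discarded by the low resolution
mse = float(np.mean((img - restored) ** 2))
print(f"MSE after 4x down/up-sampling: {mse:.4f}")
```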

Yes, one should understand that the model learns from the entire image, not just the part we are interested in. Surprisingly, it may even learn from what you don't want it to, and that's why you need a large distribution of images and scenarios!

The main point is that the data used to test the model should come from a similar distribution to the one it was trained on, and the training set should be large enough to cover every likely scenario.
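One quick, hedged way to sanity-check this (the arrays below are synthetic stand-ins, not the actual SIGNS data) is to compare per-channel pixel statistics between the training set and your own photo. A large gap in the channel means is a cheap red flag for a distribution shift such as different illumination:

```python
import numpy as np

rng = np.random.default_rng(1)
# Stand-ins: bright, evenly lit training images vs. one dim test photo
train = rng.normal(0.5, 0.2, size=(100, 64, 64, 3)).clip(0, 1)
test = rng.normal(0.2, 0.2, size=(10, 64, 64, 3)).clip(0, 1)

train_mean = train.mean(axis=(0, 1, 2))  # per-channel means over the set
test_mean = test.mean(axis=(0, 1, 2))

# A big per-channel gap suggests the test image is out of distribution
shift = np.abs(train_mean - test_mean)
print("per-channel mean shift:", np.round(shift, 3))
if (shift > 0.1).any():
    print("Warning: test images differ noticeably from the training distribution")
```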

Oh I see. So we should have

  1. A million or more images for training
  2. High-resolution images (1 MP might be good)
  3. Training images that cover the full range of illumination intensity, relative size, backgrounds, etc.

Ideally yes!