In lab 1 of week 2, there is a step where we test the trained ResNet-50 on our own images. I found that the network consistently miscategorises my images, and the lab asks what we can do to improve on that, but I found myself drawing a blank.
In a separate post from 3 years ago, it was suggested that we could train the network with our own images, but that basically means the network is not generalising well!
I tested images that look very close to the originals and tried different lighting, but still no luck. What are some ideas that could be implemented to improve this?
If I recall correctly, the issue with testing your own images is that the model in the notebook isn’t fully trained, due to resource constraints on Coursera’s servers.
To be sure, I’ll have to find some time to review that assignment in some detail. That may take a few days.
From what I see in the code, testing our own images is done with the pre-trained model, which achieves almost 95% accuracy on the test data.
Since posting this, I’ve compared the test data with my own image, and the only difference I can find is the background colour; everything else looks similar, for example the way I held my hand and the clarity of the image after resampling.
Looking forward to your findings. In the meantime, I may try changing the background colour to see if that helps.
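In case it helps, this is roughly how I’m preparing my image before predicting. It’s only a sketch, assuming the 64x64 input size and the /255 scaling the notebook uses; the file name is just a placeholder, and the commented lines refer to the notebook’s own `X_train` and `model` variables.

```python
import numpy as np
from tensorflow.keras.preprocessing import image

# Placeholder file name for my own photo.
img_path = "my_hand_sign.jpg"

# Load and resize to the same size as the training images (64x64 RGB).
img = image.load_img(img_path, target_size=(64, 64))
x = image.img_to_array(img)

# Scale pixels the same way the training set was scaled (divide by 255)
# and add a batch dimension.
x = np.expand_dims(x, axis=0) / 255.

# Quick sanity check: compare basic statistics with a training image to
# spot obvious distribution differences, such as a very different background.
print("my image mean/std:", x.mean(), x.std())
# print("train image mean/std:", X_train[0].mean(), X_train[0].std())

# prediction = model.predict(x)
# print("predicted class:", np.argmax(prediction))
```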
I don’t think it’s enough to say that the images look the same to the eye! There might be subtle differences at the pixel level, and that would make your images fall outside the distribution the model was trained on.
I would say, if it’s possible, fine-tune the existing model on a set of your own images and then check what performance you get; see the sketch below.
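Something along these lines could be a starting point. It’s only a rough sketch, assuming `model` is the pre-trained ResNet-50 loaded in the notebook and `my_images` / `my_labels` are your own photos preprocessed exactly like the training data (resized to 64x64, scaled by 1/255, with one-hot labels):

```python
import tensorflow as tf

# Assumptions (not from the notebook itself): `model` is the pre-trained
# ResNet-50 from the assignment, and `my_images` / `my_labels` are a small
# set of your own photos, already resized to 64x64, scaled by 1/255, and
# labelled with one-hot vectors.

# Freeze all but the last few layers so a small dataset only adjusts the
# top of the network rather than retraining everything.
for layer in model.layers[:-5]:
    layer.trainable = False

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
              loss="categorical_crossentropy",
              metrics=["accuracy"])

# A few epochs on a few dozen images is enough to see whether the model
# can adapt to your backgrounds and lighting.
model.fit(my_images, my_labels, epochs=5, batch_size=8)
```

Freezing most of the layers keeps a tiny dataset from overwriting what the model has already learned; if you have more of your own images, you could try unfreezing more layers.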