I used this image to test the logistic regression model we built in the Week 2 assignment, and the prediction said there is a cat in this image.
Is that because of the ~70% accuracy we got? I don’t understand why it predicted a cat.
There are two fundamental problems here:
- Logistic Regression is just not powerful enough to do a very good job on this type of image recognition problem.
- The training dataset is way too small to give a good result that generalizes to arbitrary input images. We’ll use the same dataset in Week 4 with a much more powerful 4-layer Neural Network, and it will do much better on the training and test data, but it still won’t generalize all that well.
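For context, the prediction step of the Week 2 model boils down to a single sigmoid of a dot product. Here is a minimal numpy sketch (variable names are illustrative, not the assignment’s exact code) to show just how little machinery Logistic Regression has to work with:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(w, b, X):
    # X: (num_features, num_examples), w: (num_features, 1)
    A = sigmoid(np.dot(w.T, X) + b)   # P(cat) for each column of X
    return (A > 0.5).astype(int)      # threshold at 0.5

# One flattened 64 x 64 RGB image: 64 * 64 * 3 = 12288 features
w = np.zeros((12288, 1))              # untrained weights, just to show shapes
b = 0.0
X = np.random.rand(12288, 1)
print(predict(w, b, X))               # sigmoid(0) = 0.5, so this prints [[0]]
```

Every pixel gets exactly one weight, so the model can only learn one linear boundary in pixel space. That is why no amount of tuning will make it competitive with a real Neural Network on this task.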
For a real system on this type of recognition task, you would typically have O(10^5) or even O(10^6) training sample images to get a system that works well. It’s actually pretty amazing that it works as well as it does with this small a training set. I think the training and test data is pretty carefully curated to give this good a result.
The other perhaps more minor point to make is that you are showing the full resolution image, but that is not what the algorithm is actually “seeing”. Notice that the algorithm only handles 64 x 64 images. To be fair, you should show us the image after the “downsampling” logic that they provide you in the “Test with Your Own Image” cell. But I grant you that it really doesn’t look anything like a cat.
Thank you for replying. I want to ask about the amounts of data you gave as examples, O(10^5) or even O(10^6).
Are these numbers related to the size of the network, for example, or is it generally the case that we need this much data to get good performance?
Is it the same for machine learning and deep learning?
I know the chart of performance vs. data size that Andrew showed in Week 1, but isn’t there another way to get good performance, say above 85% accuracy, with a dataset of 2000 examples?
This is the image after downsampling
I was talking about the size of the training dataset that you need in order to get good results on an image recognition task, i.e. the number of samples. The general rule, as Prof Ng discussed in the lectures, is that “more data is better” up to some point of diminishing returns. So 2000 training images will be much better than the 209 training images we have here. Even with 2000 images, we probably won’t get 85% accuracy from Logistic Regression, but we might very well get 85% accuracy from a real Neural Network as in Week 4. But for a “real world” solution, 85% is probably still not going to be good enough.
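You can see the “more data is better” effect on a toy problem. This is a synthetic 2-D sketch (not the course dataset, and the numbers are illustrative only): the same Logistic Regression, trained on progressively larger samples, is scored on a fixed test set:

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    # Two noisy 2-D Gaussian classes -- a toy stand-in for image features
    X = np.vstack([rng.normal(-1.0, 1.5, (n // 2, 2)),
                   rng.normal(+1.0, 1.5, (n // 2, 2))])
    y = np.repeat([0, 1], n // 2)
    return X, y

def train_logreg(X, y, lr=0.1, steps=500):
    # Plain batch gradient descent on the logistic loss
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        w -= lr * X.T @ (p - y) / len(y)
        b -= lr * np.mean(p - y)
    return w, b

X_test, y_test = make_data(5000)
for n in (50, 200, 2000):
    w, b = train_logreg(*make_data(n))
    acc = np.mean(((X_test @ w + b) > 0) == y_test)
    print(f"n={n:5d}  test accuracy = {acc:.3f}")
```

Accuracy climbs with n but flattens out near the best this model can do on this data, which is the “diminishing returns” point: past that, more data can’t fix a model that isn’t expressive enough.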
Thanks for the downsampled image. It definitely has a lot less resolution than the original. Maybe you could convince yourself some of those shapes look like a cat’s ear …