Can the convolutional network architecture taught in course 4 detect inverted images, assuming it is only trained on image data that has only upright pictures?
As far as I understand, I don’t think those networks can. Please correct me if I am wrong.
Can you give a reference to where this is discussed in the course?
To the CNN, what would identify an inverted image in the training set, unless there are examples labeled as such?
How would a human identify an inverted image? That implies that the image contains some context that establishes an orientation.
For example, if part of the image is identifiable as water, then one would expect it to be at the bottom of an image.
For images of vehicles, one would expect that the tires would be on the bottom.
By upright, I mean images that are like this:
After training, can the model detect images like this?
Do you mean, can the model tell you if the image has the orientation a human expects?
Only if you label the images as upright or inverted.
The model has no idea which way a cluster of pixels that a human identifies as a cat should be oriented, unless you tell it via the labels.
If you only train the model on images that have the orientation a human considers “normal”, and then feed it an image from outside that training domain, the model can only deduce that the image doesn’t meet the classification threshold it learned from the training labels.
It won’t know, “Oh, this must be an inverted image,” because it doesn’t understand that concept unless you teach it via the labels.
Do you mean, can the model tell you if the image has the orientation a human expects?
No, I don’t expect the model to tell me the orientation of the image. I wanted to ask: can the model detect the inverted cat image as a cat, given that it was trained only on upright images of cats?
What I really want to know is whether a convolutional network (as taught in course 4) can learn to classify an inverted image of a cat when only upright images of cats were given to it during training.
Sorry I couldn’t explain better.
Even if all the training images are in the normal orientation, the training data may include cats in various different positions, so perhaps the convnet can learn to detect things like a cat’s tail or ears or whiskers, even if they aren’t sitting in a standard “portrait” position.
This is an experimental science. You could try some experiments by flipping the test images and see how that affects the prediction accuracy.
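A minimal sketch of that experiment in NumPy (the arrays here are random placeholders; the `model` in the comments is hypothetical and stands in for your own trained classifier and test set):

```python
import numpy as np

# Hypothetical test batch: (num_images, height, width, channels).
x_test = np.random.rand(8, 64, 64, 3).astype(np.float32)

# An inverted image is a 180-degree rotation: rotate twice in the
# height/width plane.
x_inverted = np.rot90(x_test, k=2, axes=(1, 2))

# Sanity check: inverting twice restores the original images.
assert np.array_equal(np.rot90(x_inverted, k=2, axes=(1, 2)), x_test)

# With a trained Keras model and labels y_test you would then compare:
#   acc_upright  = model.evaluate(x_test, y_test)[1]
#   acc_inverted = model.evaluate(x_inverted, y_test)[1]
# A large drop on the inverted batch suggests the network did not learn
# a rotation-invariant representation from upright-only training data.
```
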
Probably not.
Maybe you could consider an inverted image as one that has been rotated by 180 degrees. If the training set did not include other images with large rotations, the CNN would probably not have learned that rotation can be ignored when making a prediction.
This sounds like a very useful experiment you can perform. Please report back your results.
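Following that reasoning, one way to make the experiment come out differently is to put rotated copies into the training set yourself. A minimal NumPy sketch of that kind of augmentation (the array names and shapes are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical training batch and binary labels.
x_train = rng.random((16, 32, 32, 3), dtype=np.float32)
y_train = rng.integers(0, 2, size=16)

# Append a 180-degree-rotated copy of every image, keeping its label,
# so the network sees both orientations during training.
x_aug = np.concatenate([x_train, np.rot90(x_train, k=2, axes=(1, 2))])
y_aug = np.concatenate([y_train, y_train])

print(x_aug.shape, y_aug.shape)  # twice as many examples as before
```
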
If you remember, Andrew used the example of a vertical line detector to explain the basics of convolutional filters. If you watch that part again, you could think about what would happen if a vertical detector were given a horizontal line. By extension, you may ask whether, in principle, such a convolutional filter, the backbone of a CNN, is rotationally invariant.
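To make that concrete, here is a small NumPy sketch of the lecture’s vertical edge detector applied to a vertical edge and to the same edge rotated 90 degrees (the toy images are my own construction):

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Plain 'valid' 2-D cross-correlation, as in the lectures."""
    h, w = kernel.shape
    out = np.zeros((image.shape[0] - h + 1, image.shape[1] - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + h, j:j + w] * kernel)
    return out

# The vertical edge detector from the lectures.
vertical_filter = np.array([[1, 0, -1],
                            [1, 0, -1],
                            [1, 0, -1]])

# A 6x6 image with a vertical edge (bright left half, dark right half)...
vertical_edge = np.hstack([np.full((6, 3), 10), np.zeros((6, 3))])
# ...and its 90-degree rotation: a horizontal edge.
horizontal_edge = np.rot90(vertical_edge)

v_response = conv2d_valid(vertical_edge, vertical_filter)
h_response = conv2d_valid(horizontal_edge, vertical_filter)

print(np.abs(v_response).max())  # strong activation (30.0) on the vertical edge
print(np.abs(h_response).max())  # zero activation on the horizontal edge
```

The filter fires strongly on the edge it was designed for and not at all on the rotated one, which is the intuition behind asking whether a CNN trained only on one orientation can handle another.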
There is quite a bit of research into “rotation-invariant CNNs”, which I believe is worth the time to search out and read, if you are interested.
Cheers,
Raymond