Course 4 Week 2 Programming Assignment 1 Optional Question

Did anyone get reasonably correct results on their own images with the model that classifies images of numbers in sign language?

The outputs of the model on my own images are completely incorrect. Of the six images I tried, ALL were incorrect. I even tried rotating my images clockwise by 90 degrees, but got the same result. I am not sure if I am doing something wrong. It does not seem to be the lighting or clarity of the picture, as mentioned in the note for the question. That seems surprising for a model that claims to have greater than 70% accuracy.

Centering, lighting, scaling, color tones, and overall image characteristics are important when providing your own images.

Note that this is Week 1 Assignment 2, not Week 2 Assignment 1. I will edit the title using the little “edit pencil” to correct that and the tag.

There are several possibilities here. The first thing to check is that you performed the same scaling on your sample input images as was done with the training data, as shown in this cell that was given to you:

X_train = X_train_orig/255.
X_test = X_test_orig/255.
Y_train = convert_to_one_hot(Y_train_orig, 6).T
Y_test = convert_to_one_hot(Y_test_orig, 6).T
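
For example, here is a rough sketch of feeding a single custom image through a model with that same scaling applied. The file name is made up, `model` stands for whichever model you are testing, and the Keras `image` utilities are assumed to be the ones imported in the notebook:

```python
import numpy as np
from tensorflow.keras.preprocessing import image  # assumed to match the notebook's imports

img = image.load_img("my_hand_sign.jpg", target_size=(64, 64))  # hypothetical file name
x = image.img_to_array(img)      # shape (64, 64, 3), values 0..255
x = np.expand_dims(x, axis=0)    # shape (1, 64, 64, 3)
x = x / 255.                     # same scaling as the training data above
prediction = model.predict(x)    # `model` = whichever model you are testing
print("Predicted class:", np.argmax(prediction))
```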

If you did that correctly and still got no correct predictions, then the other more general possibility is that the dataset here is not large enough to get a generalizable model for an image problem like this. We have only 1080 training samples.

Also consider Tom’s points about how similar the composition of your sample images is to the ones in the training set. Print a few of both and visually compare them.
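
As a sketch of that visual comparison (assuming matplotlib, the notebook's X_train array, and a preprocessed image of your own in a (1, 64, 64, 3) array called `x`):

```python
import matplotlib.pyplot as plt

fig, axes = plt.subplots(1, 2)
axes[0].imshow(X_train[0])        # a sample from the training set (already scaled to 0..1)
axes[0].set_title("training sample")
axes[1].imshow(x[0])              # your own image after resizing and the same scaling
axes[1].set_title("my image")
plt.show()
```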

Thank you @paulinpaloalto and @TMosh for your responses.

I rechecked, and I am actually asking about Week 2 Assignment 2. This is the one for building a ResNet and training it on the SIGNS dataset.

I will try to compare and match up the scaling of my samples with the training set.

Oh, sorry, you’re right that the Residual Networks assignment also uses the same “signs” dataset as the W1 A2 assignment does. Residual Networks is W2 A1. I restored the title to the way you had it. Sorry again.

But the points that Tom and I made are still applicable. Let us know what you think after considering those points.

Just refreshing my memory on the Residual Networks assignment, the model we train in the notebook for 10 epochs gets 95% training accuracy and 80% test accuracy. Notice that they also give us a pre-trained model that produces 95% test accuracy. They don’t give very much detail about how that pre-trained model was trained. E.g. did it use a larger training set than the 1080 sample version we used? Or did they just run the training for a lot more epochs?

In your experiments, were you using the pre-trained model or the model that you actually trained in the notebook for 10 epochs?

Looks like this optional exercise uses the pre_trained_model. The code was already there. I was just updating the input image name and running the code.
prediction = pre_trained_model.predict(x2)
The code seems to set the target image size to 64 x 64. Maybe my aspect ratio is not correct, as the image was uploaded as-is from a mobile phone.
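
To see how much the 64 x 64 resize distorts a non-square phone photo, a quick check like this might help (a sketch using Pillow; the file name is made up):

```python
from PIL import Image

img = Image.open("my_phone_photo.jpg")  # hypothetical file name
print("original size:", img.size)       # e.g. (4032, 3024) for a typical phone photo
small = img.resize((64, 64))            # the same squash that a 64 x 64 target size applies
small.show()                            # eyeball whether the hand still looks plausible
```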

Well, that given code block displays the image in its 64 x 64 reduced form. What does that look like with your cellphone created image? You could paste a screenshot here. In principle, one wouldn’t think that aspect ratio would make that much difference in recognizing shapes that are pretty distinctive as in this case, but maybe my intuition there is just incorrect.

You could also perform some more direct experiments with aspect ratio: at least on an iPhone, you can crop an image in the Photos app. So take a picture of your hand and then crop the image so that it is essentially square, but still contains your complete hand. Then the downsampling to 64 x 64 shouldn't affect the aspect ratio. Does that make a difference?
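
If you would rather do the crop programmatically than in the Photos app, a rough Pillow sketch might look like this (file name is made up; adjust the crop box so the whole hand stays in frame):

```python
from PIL import Image

img = Image.open("my_phone_photo.jpg")   # hypothetical file name
w, h = img.size
side = min(w, h)                         # center-crop to a square to preserve aspect ratio
left = (w - side) // 2
top = (h - side) // 2
square = img.crop((left, top, left + side, top + side))
square.resize((64, 64)).save("my_phone_photo_square_64.jpg")
```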

Science! :nerd_face:

Supposed to be 0; Classified as 2 (not sure why?)

[Image: the "0" hand sign, shown in its 64 x 64 reduced form]

Original image: 1183x1183 resolution

Interesting. I would say that your picture looks very similar to the ones they show from the training dataset.

Do you have more examples of misclassifications? Is there any pattern? E.g. everything gets classified as 2?

I can provide examples if you would like, but each one is getting misclassified to a different class (not necessarily 2). Either there is a problem with the model (it is pre-trained, though, so nothing I could have done wrong there!), or I am doing something else wrong.

0 classified as 2 (incorrect), already shown above
1 classified as 1 (correct)
2 classified as 1 (incorrect)
3 classified as 2 (incorrect)
4 classified as 2 (incorrect)
5 classified as 2 (incorrect)

Thanks very much for the detailed response. It looks like you did a great job with creating the square images.

My next thought was to try the pre-trained model with some of the actual training images and see how it does. Then I took a look under the “images” subdirectory and they give you 3 other images for 1, 3 and 5. I tried predicting them and (just like in your examples) only the sample for 1 gets predicted correctly.
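
Roughly what I ran, as a sketch (the actual file names and extensions under the "images" directory may differ, so this just globs whatever is there rather than hard-coding names):

```python
import glob
import numpy as np
from tensorflow.keras.preprocessing import image

for img_path in sorted(glob.glob("images/*.jpg")):
    img = image.load_img(img_path, target_size=(64, 64))
    x = np.expand_dims(image.img_to_array(img), axis=0) / 255.
    pred = pre_trained_model.predict(x, verbose=0)
    print(img_path, "-> predicted class", np.argmax(pred))
```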

But there is also something odd going on with how the image library downsamples those images. I looked at them in their native state, and they are getting rotated by the "load + downsample" step. I am not sure whether that might be part of the problem. Do you see your images in the correct orientation when you run that prediction cell, as I showed above?
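
One hedged guess: phone photos often carry an EXIF orientation tag that some image loaders ignore, which can produce exactly this kind of rotation. If that is what is happening, normalizing the orientation before any resizing might be worth a try (a Pillow sketch, with a made-up file name):

```python
from PIL import Image, ImageOps

img = Image.open("my_phone_photo.jpg")  # hypothetical file name
img = ImageOps.exif_transpose(img)      # applies the EXIF orientation tag, if present
img.save("my_phone_photo_upright.jpg")
```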

Hello @paulinpaloalto, thank you for taking a detailed look at the issue. In the beginning, when I was uploading my images at a 16:9 aspect ratio (as-is from the mobile phone), they were getting rotated 90 degrees counterclockwise, exactly as you have shown.

I tried rotating that 16:9 image 90 degrees clockwise (because the output seemed to be rotated 90 degrees counterclockwise) to see if that would help. Interestingly, the image shown was still the same (now rotated 180 degrees counterclockwise).

When I tested completely square (1:1) aspect ratio images, there was no rotation. There was downsampling, which is expected, but the outputs were incorrect, as I mentioned before. Out of 0 through 5, only 1 was classified correctly.

I am a bit surprised, as the validation script shows close to 95% accuracy. It would be interesting to see the test set that was used to arrive at this 95% figure.
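
One sanity check would be to evaluate the pre-trained model directly on the X_test / Y_test arrays from the notebook (a rough sketch, assuming those variables are still loaded and the loaded model was compiled with an accuracy metric):

```python
# Assumes pre_trained_model, X_test, and Y_test are defined as in the notebook.
loss, acc = pre_trained_model.evaluate(X_test, Y_test, verbose=0)
print("test accuracy:", acc)
```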

You can display individual test set images like this:
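A rough sketch of that (assuming matplotlib and the notebook's X_test / Y_test arrays):

```python
import numpy as np
import matplotlib.pyplot as plt

i = 0                                     # pick any index into the test set
plt.imshow(X_test[i])                     # X_test is already scaled to 0..1
plt.title("label: " + str(np.argmax(Y_test[i])))
plt.show()
```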

I ran this code and the only output class I got was 0. There is definitely something wrong that I am not able to understand. This is with the pre_trained_model that is simply loaded. I get the same "Class = 0" result if I use the model that I built as part of the assignment and for which I got a 100% grade.

```python
print(Y_test.shape)
for i in range(len(X_test)):
    img = X_test[i, :, :, :]
    # imshow(img)
    # show()
    x = image.img_to_array(img)
    x = np.expand_dims(x, axis=0)
    x = x / 255.0
    x2 = x
    # print('Input image shape:', x.shape)
    prediction = pre_trained_model.predict(x2, verbose=0)
    # or:
    # prediction = model.predict(x2, verbose=0)
    # print("Class prediction vector [p(0), p(1), p(2), p(3), p(4), p(5)] = ", prediction)
    print("Sample #" + str(i + 1) + " = Class: " + str(np.argmax(prediction)) + "; Expected Y: " + str(Y_test[i]))
```

Try just using “p = model(X_test)”, without a for-loop or any of your pre-processing, and then use argmax on axis = 1.

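Something along these lines (a sketch of the suggestion above, assuming the notebook's X_test / Y_test are in scope):

```python
import numpy as np

p = model(X_test)                          # or pre_trained_model(X_test); X_test is already scaled
pred_classes = np.argmax(p, axis=1)
true_classes = np.argmax(Y_test, axis=1)
print(pred_classes[:20])
print("accuracy:", np.mean(pred_classes == true_classes))
```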
