Over 90% accuracy but wrong predictions

esssyjr · April 12, 2024, 9:13am

Good day everyone
I am working on an eight-class image classification model. I have trained the model and reached over 90% accuracy on both the training and validation sets. But when I visualize the predictions, they are all wrong. Has anyone had a similar experience? What might the issue be?
Thank you all

balaji.ambresh · April 12, 2024, 9:26am

Assuming that no coding errors exist, does this help?

gent.spah · April 12, 2024, 9:34am

Also check your labels and what the model is actually learning, i.e. which part of the image it is learning from. It may not be the part you want!
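One concrete way to check the labels is to compare the class order the data loader uses with the order the visualization code assumes. The sketch below is only an illustration: the folder names are hypothetical, and it assumes a loader that assigns class indices by sorting names as strings, which many directory-based loaders do. If the plotting code assumes numeric order instead, every displayed label can be wrong even though the reported accuracy is correct.

```python
# Hypothetical class folder names (an assumption for illustration);
# numeric-looking names are the risky case.
folder_names = ["5", "10", "20", "50", "100", "200", "500", "1000"]

# Order a loader would use if it sorts class names as strings.
index_to_name = sorted(folder_names)

# Order a person might assume when reading the names as numbers.
numeric_order = sorted(folder_names, key=int)

# Compare the two mappings index by index.
mismatches = [
    (idx, loader_name, assumed_name)
    for idx, (loader_name, assumed_name) in enumerate(zip(index_to_name, numeric_order))
    if loader_name != assumed_name
]

for idx, loader_name, assumed_name in mismatches:
    print(f"index {idx}: loader says {loader_name}, numeric assumption says {assumed_name}")
```

Whatever the loader, printing its own index-to-name mapping and using that mapping when plotting predictions avoids this kind of mismatch.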

esssyjr · April 12, 2024, 9:52am

Sure, my dataset is imbalanced across the classes. What I assumed is that the model would at least correctly predict the class with the most data. But it’s not doing so.

esssyjr · April 12, 2024, 9:54am

Please explain your second statement further: “which part of the image it is learning from”. This is new to me. Or please refer me to a resource to learn from.
Thanks a lot sir

balaji.ambresh · April 12, 2024, 10:31am

What’s the train / val set label distribution?

esssyjr · April 12, 2024, 10:36am

80%-20%

balaji.ambresh · April 12, 2024, 10:42am

You have 8 label classes. I didn’t ask you for train / val set split ratio.

What’s the label count per class on train set and validation set? Here’s an example:

>>> labels = [0, 1, 1, 2, 1, 0, 0, 0]
>>> import numpy as np
>>> np.unique(labels, return_counts=True)
(array([0, 1, 2]), array([4, 3, 1]))

Please report 2 such entries (1 for the train set and another for the validation set).

esssyjr · April 12, 2024, 4:15pm

Ohhhh I got you

Train_dataset:
[5,10,20,50,100,200,500,1000],
[174,167,88,278,228,321,16,371]
Test_dataset:
[5,10,20,50,100,200,500,1000],
[34,37,13,63,45,59,6,67]

balaji.ambresh · April 12, 2024, 5:25pm

Based on the label distribution, the assumption that the model will correctly predict the class with the most data is wrong: the decision boundary is learnt from all data points across the multiple classes, which are imbalanced but not extremely skewed in this case.

If the labels were like {class0: 9990, class1: 10}, one could hope that the model would get class0 right during the prediction phase.
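To put numbers on this, here is a small sketch using the training counts reported earlier in the thread. It computes the accuracy of a model that always predicts the most frequent class, and the ratio between the largest and smallest class. The variable names are illustrative only.

```python
import numpy as np

# Per-class training counts as reported earlier in this thread.
class_names = [5, 10, 20, 50, 100, 200, 500, 1000]
train_counts = np.array([174, 167, 88, 278, 228, 321, 16, 371])

total = int(train_counts.sum())
proportions = train_counts / total

# Accuracy of a model that always predicts the most frequent class.
majority_idx = int(np.argmax(train_counts))
majority_baseline = float(proportions[majority_idx])

# Ratio between the largest and the smallest class.
imbalance_ratio = float(train_counts.max() / train_counts.min())

print(f"total samples:   {total}")
print(f"majority class:  {class_names[majority_idx]} ({majority_baseline:.1%})")
print(f"imbalance ratio: {imbalance_ratio:.1f}x")
```

The majority class covers only about 23% of the training set, so always predicting it could not account for an accuracy above 90%.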

esssyjr · April 12, 2024, 5:40pm

Okay, thank you very much sir.
So you suggest I should increase the dataset and also try to balance it?

balaji.ambresh · April 12, 2024, 6:12pm

You’re welcome. Class weighting is a good place to start. Do try other techniques that you and @gent.spah have listed as well.
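As a rough sketch of class weighting, the snippet below derives per-class weights from the training counts reported earlier in the thread, using the common "balanced" heuristic `n_samples / (n_classes * count)`. It assumes class indices 0 to 7 follow the order of the reported label list; that ordering is an assumption and should be checked against the actual data pipeline.

```python
import numpy as np

# Per-class training counts as reported earlier in this thread.
class_names = [5, 10, 20, 50, 100, 200, 500, 1000]
train_counts = np.array([174, 167, 88, 278, 228, 321, 16, 371])

n_classes = len(train_counts)

# "Balanced" heuristic: rare classes get proportionally larger weights.
weights = train_counts.sum() / (n_classes * train_counts)

# Dict keyed by class index (assumed order: same as class_names).
class_weight = {i: float(w) for i, w in enumerate(weights)}

for name, w in zip(class_names, weights):
    print(f"class {name:>4}: weight {w:.2f}")
```

If the model is trained with Keras (an assumption, since the thread does not say), a dict of this shape can be passed as the `class_weight` argument of `model.fit`.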

gent.spah · April 13, 2024, 5:58am

Yes, the TensorFlow: Advanced Techniques Specialization explains this by creating heat maps that show which parts of the image the model is learning from. Basically, instead of learning from the object you want in the image, the model is learning from something else in that image…
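A minimal, framework-free way to get such a heat map is occlusion sensitivity: blank out one patch of the image at a time and record how much the predicted probability for the class drops. This is not necessarily the method used in the course, only a simple stand-in. `predict_fn` is a hypothetical callable that maps a batch of images to class probabilities, and `toy_predict` is a made-up model used only to make the example runnable.

```python
import numpy as np

def occlusion_heatmap(predict_fn, image, target_class, patch=8, fill=0.0):
    """Return a grid of probability drops, one cell per occluded patch.

    predict_fn: hypothetical callable, (N, H, W, C) -> (N, num_classes).
    """
    h, w = image.shape[:2]
    base = predict_fn(image[None])[0, target_class]
    rows = -(-h // patch)  # ceiling division
    cols = -(-w // patch)
    heat = np.zeros((rows, cols))
    for i in range(rows):
        for j in range(cols):
            occluded = image.copy()
            occluded[i * patch:(i + 1) * patch, j * patch:(j + 1) * patch] = fill
            prob = predict_fn(occluded[None])[0, target_class]
            # A large drop means the model relied on this patch.
            heat[i, j] = base - prob
    return heat

def toy_predict(batch):
    """Made-up 2-class model that only looks at the top-left 8x8 corner."""
    score = batch[:, :8, :8].mean(axis=(1, 2, 3))
    return np.stack([score, 1.0 - score], axis=1)

image = np.ones((16, 16, 3))
heat = occlusion_heatmap(toy_predict, image, target_class=0, patch=8)
print(heat)
```

For the toy model, only the top-left cell shows a drop, which is exactly the region it looks at. With a real model, large values outside the object of interest suggest it is relying on background or other unintended cues.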

Rorisang · April 15, 2024, 2:52pm

@esssyjr, @gent.spah could be right.
Maybe the model is classifying using ‘unexpected’ feature(s) in the images. Look at the link below:
HuskyVSWolfClassification.

Also, check for data leakage:
DataLeakageExampl.

Could your model be overfitting?
OverUnderFitting
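For the data leakage point, one simple check is to look for validation images that also appear in the training set. The sketch below hashes raw pixel arrays, so it only catches exact duplicates; resized, re-encoded or augmented copies would need a perceptual hash or similar. The function names and the random stand-in images are made up for this example.

```python
import hashlib
import numpy as np

def image_digest(image):
    """Hash an image array, including shape and dtype in the digest."""
    arr = np.ascontiguousarray(image)
    header = f"{arr.shape}|{arr.dtype}".encode()
    return hashlib.sha256(header + arr.tobytes()).hexdigest()

def find_leaked_indices(train_images, val_images):
    """Return indices of validation images that exactly match a training image."""
    train_digests = {image_digest(img) for img in train_images}
    return [i for i, img in enumerate(val_images)
            if image_digest(img) in train_digests]

# Toy data: random arrays standing in for real images.
rng = np.random.default_rng(0)
train_images = [rng.integers(0, 256, size=(4, 4, 3), dtype=np.uint8) for _ in range(3)]
val_images = [rng.integers(0, 256, size=(4, 4, 3), dtype=np.uint8) for _ in range(2)]
val_images.append(train_images[1].copy())  # deliberately leak one training image

leaked = find_leaked_indices(train_images, val_images)
print(leaked)
```

Any index reported here means the validation accuracy is partly measured on images the model has already seen during training.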

dingowhiz · April 16, 2024, 9:31am

Is it overfitted?
