Good day everyone
I am working on an eight-class image classification model. I have trained the model and got over 90% accuracy on both the training and validation sets. But when I visualize the predictions, they are all wrong. Has anyone had a similar experience? What might the issue be?
Thank you all
Also check your labels, and check what the model is actually learning from, i.e. which part of the image it is using. It may not be the part you want!
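One concrete way to "check your labels" is to confirm that the index-to-class mapping you use when visualizing predictions is the same one used during training. This is only a sketch under assumptions: the folder names below are hypothetical, and whether your data loader orders class names as strings depends on the loader, so check its documentation.

```python
# Hypothetical folder names; replace with your own class names.
class_names = ["5", "10", "20", "50", "100", "200", "500", "1000"]

# Order a loader that sorts class names as strings would use (assumption).
loader_order = sorted(class_names)

# Order you might assume when mapping predicted indices back to names.
numeric_order = sorted(class_names, key=int)

def mismatched_indices(a, b):
    """Return the indices where two index->name mappings disagree."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]

bad = mismatched_indices(loader_order, numeric_order)
print("loader order: ", loader_order)
print("numeric order:", numeric_order)
print("indices that disagree:", bad)
```

If the two orders disagree, the metrics can look fine while every visualized prediction is shown under the wrong class name.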
Sure, my dataset is imbalanced across the classes. My assumption was that the model would at least correctly predict the class with the most data, but it's not doing so.
Please explain your second statement further: "which part of the image it is learning from". This is new to me. Or please point me to a resource to learn from.
Thanks a lot sir
What's the train / val set label distribution?
80%-20%
You have 8 label classes. I didn't ask you for train / val set split ratio.
What's the label count per class on train set and validation set? Here's an example:
>>> labels = [0, 1, 1, 2, 1, 0, 0, 0]
>>> import numpy as np
>>> np.unique(labels, return_counts=True)
(array([0, 1, 2]), array([4, 3, 1]))
Please report 2 such entries (1 for train set and another for validation set)
Ohhhh I got you
Train_dataset:
[5,10,20,50,100,200,500,1000],
[174,167,88,278,228,321,16,371]
Test_dataset:
[5,10,20,50,100,200,500,1000],
[34,37,13,63,45,59,6,67]
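To make the counts above easier to read, here is a small sketch that turns them into per-class shares and a largest-to-smallest ratio. The numbers are the ones reported in this thread; the helper name `summarize` is just for illustration.

```python
# Counts reported above, in the same order as the class labels.
classes = [5, 10, 20, 50, 100, 200, 500, 1000]
train_counts = [174, 167, 88, 278, 228, 321, 16, 371]
test_counts = [34, 37, 13, 63, 45, 59, 6, 67]

def summarize(counts):
    """Return (total, per-class share, largest/smallest count ratio)."""
    total = sum(counts)
    shares = [c / total for c in counts]
    ratio = max(counts) / min(counts)
    return total, shares, ratio

for name, counts in [("train", train_counts), ("test", test_counts)]:
    total, shares, ratio = summarize(counts)
    print(f"{name}: total={total}, max/min ratio={ratio:.1f}")
    for cls, share in zip(classes, shares):
        print(f"  class {cls:>4}: {share:.1%}")
```

The largest class is roughly 23% of the training set, so a model that only ever predicted that class could not reach 90% accuracy on this data.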
Based on the label distribution, your assumption that the model would at least get the largest class right does not hold: the decision boundary is learnt from all data points across the classes, and those classes are imbalanced but not extremely skewed in this case.
If the labels were like {class0: 9990, class1: 10}, one can hope that the model will get class0 right during prediction phase.
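The extreme case above can be worked through in a few lines. This is a toy sketch, not a real model: the "predictions" simply always output class 0, to show how overall accuracy and per-class recall diverge under heavy skew.

```python
# Labels matching the example above: 9990 of class 0, 10 of class 1.
y_true = [0] * 9990 + [1] * 10

# A stand-in "model" that ignores its input and always predicts class 0.
y_pred = [0] * len(y_true)

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def recall_per_class(y_true, y_pred):
    """Fraction of each true class that was predicted correctly."""
    out = {}
    for c in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == c]
        out[c] = sum(y_pred[i] == c for i in idx) / len(idx)
    return out

print("accuracy:", accuracy(y_true, y_pred))
print("recall per class:", recall_per_class(y_true, y_pred))
```

Here class 0 is always right and class 1 is never right, which is why per-class numbers are worth reporting alongside overall accuracy.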
Okay, thank you very much sir.
So you suggest I should increase the dataset and also try to balance it?
You're welcome. Class weighting is a good place to start. Do try other techniques that you and @gent.spah have listed as well.
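As a starting point for class weighting, here is a minimal sketch using the common "balanced" heuristic, weight = total / (num_classes * class_count), applied to the training counts from this thread. The mapping of index 0..7 to the classes in the order listed is an assumption; use whatever index order your data loader actually assigns.

```python
# Training counts from this thread, in the order the classes were listed.
train_counts = [174, 167, 88, 278, 228, 321, 16, 371]

def balanced_class_weights(counts):
    """Weight each class by total / (num_classes * class_count)."""
    total = sum(counts)
    k = len(counts)
    return {i: total / (k * c) for i, c in enumerate(counts)}

weights = balanced_class_weights(train_counts)
for idx, w in weights.items():
    print(f"class index {idx}: weight {w:.2f}")
```

If you train with Keras, a dict like `weights` can be passed as the `class_weight` argument of `model.fit`. Rare classes get larger weights, so their errors count more in the loss.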
Yes, the TensorFlow: Advanced Techniques Specialization explains this by creating heat maps that show which parts of the image the model is learning from. Basically, instead of learning from the object you want in the image, the model is learning from something else in that image...
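If you want a quick heat map without any framework-specific tooling, occlusion sensitivity is one simple option: mask one patch at a time and record how much the score for a class drops. This is a sketch of that technique, not necessarily the method used in the course. `predict_fn` and `toy_predict` are hypothetical stand-ins for your model's prediction function.

```python
import numpy as np

def occlusion_heatmap(image, predict_fn, target_class, patch=8, stride=8, fill=0.0):
    """Score drop for target_class when each patch is masked out.

    image: array of shape (H, W, C).
    predict_fn: hypothetical callable mapping a batch (N, H, W, C)
        to class probabilities of shape (N, num_classes).
    """
    h, w = image.shape[:2]
    base = predict_fn(image[None])[0, target_class]
    heat = np.zeros((h, w), dtype=float)
    hits = np.zeros((h, w), dtype=float)
    for top in range(0, h, stride):
        for left in range(0, w, stride):
            masked = image.copy()
            masked[top:top + patch, left:left + patch] = fill
            score = predict_fn(masked[None])[0, target_class]
            # A large drop means the model relied on this region.
            heat[top:top + patch, left:left + patch] += base - score
            hits[top:top + patch, left:left + patch] += 1
    return heat / np.maximum(hits, 1)

def toy_predict(batch):
    # Stand-in model: class-1 probability is the mean brightness
    # of the top-left 8x8 corner, so only that corner matters.
    p1 = batch[:, :8, :8, :].mean(axis=(1, 2, 3))
    return np.stack([1.0 - p1, p1], axis=1)

image = np.ones((32, 32, 3))
heat = occlusion_heatmap(image, toy_predict, target_class=1)
print("hottest pixel:", np.unravel_index(heat.argmax(), heat.shape))
```

With a real model, overlay `heat` on the input image. If the hot regions sit on the background rather than on the object, the model is probably keying on something you did not intend.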
@esssyjr , @gent.spah could be right.
Maybe the model is classifying using "unexpected" feature(s) in the images. Look at the link below:
HuskyVSWolfClassification.
Also, check whether there is any data leakage:
DataLeakageExampl.
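One basic leakage check is to look for images that appear in both the training and validation sets. The sketch below compares content hashes; the file names and byte contents are toy stand-ins, and in practice you would read the bytes from your own image files.

```python
import hashlib

def content_hash(data: bytes) -> str:
    """SHA-256 hex digest of the raw bytes."""
    return hashlib.sha256(data).hexdigest()

def find_overlap(train_items, val_items):
    """Return names of validation items whose bytes also appear in train.

    Each argument maps a name (e.g. a file path) to raw bytes.
    """
    train_hashes = {content_hash(b) for b in train_items.values()}
    return sorted(n for n, b in val_items.items()
                  if content_hash(b) in train_hashes)

# Toy stand-ins for image files (hypothetical names and contents).
train_items = {"train/a.jpg": b"\x01\x02\x03", "train/b.jpg": b"\x04\x05"}
val_items = {"val/c.jpg": b"\x01\x02\x03", "val/d.jpg": b"\x09"}

print(find_overlap(train_items, val_items))
```

This only catches byte-identical duplicates. Resized, re-encoded, or augmented copies of the same picture would need a looser comparison.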
Could your model be overfitting?
OverUnderFitting
Is it overfitted?
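A rough way to look for overfitting is to compare training and validation accuracy per epoch; a large and growing gap is a common warning sign. This is a sketch only: the curves below are made up for illustration, and the 0.10 threshold is an arbitrary assumption.

```python
def generalization_gap(train_acc, val_acc):
    """Per-epoch difference between training and validation accuracy."""
    return [t - v for t, v in zip(train_acc, val_acc)]

def looks_overfit(train_acc, val_acc, threshold=0.10):
    """True if the final-epoch gap exceeds the (arbitrary) threshold."""
    return generalization_gap(train_acc, val_acc)[-1] > threshold

# Made-up curves for illustration only.
train_acc = [0.60, 0.80, 0.92, 0.98]
val_acc = [0.58, 0.75, 0.78, 0.76]

print(generalization_gap(train_acc, val_acc))
print(looks_overfit(train_acc, val_acc))
```

In this thread both training and validation accuracy are above 90%, so this check alone would not flag a gap. Evaluating on images that were used for neither training nor validation is a sensible next step.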