I used a ResNet-50 model on the CIFAR-10 dataset, which has 10 classes of 32x32x3 images. My training accuracy was 98%, but when I tried to get predictions, all of them gave the same value. I don’t know what the reason is. Can someone help me out with this??
What do you mean by the same value of prediction? Is it always predicting positive even when it is supposed to be negative?
Then you might be dealing with overfitting of the training dataset!
Yes, it always predicts that the picture is a truck, even when it isn’t. The training accuracy is 98%, but when I run predictions even on the training set, the model predicts a truck again and again. So is it a problem of overfitting??
I can’t understand you well. If the model is getting things wrong even on the training data, then you may have a problem with the labels. It could be overfitting, or it’s possible that your model is not learning from the right part of the image, in which case you need more images with different backgrounds.
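One quick sanity check along those lines (a sketch in plain NumPy; `y_train` and `preds` are hypothetical names standing in for your real label and prediction arrays of integer class IDs): compare how often each class appears in the labels versus in the predictions. If one class dominates the predictions but not the labels, the model has collapsed to a single output rather than having bad labels.

```python
import numpy as np

# Stand-in arrays; in practice y_train is your CIFAR-10 labels and
# preds is the class predicted by the model for each training image.
y_train = np.array([0, 1, 9, 9, 3, 9, 5])   # true labels (0-9)
preds   = np.array([9, 9, 9, 9, 9, 9, 9])   # model always says class 9 ("truck")

# Count occurrences of each of the 10 classes in labels vs. predictions.
label_counts = np.bincount(y_train, minlength=10)
pred_counts  = np.bincount(preds, minlength=10)

print("label counts:", label_counts)
print("pred counts: ", pred_counts)

# A collapsed model puts all (or nearly all) of its mass on one class:
collapsed = pred_counts.max() == len(preds)
print("model collapsed to a single class:", collapsed)
```

Here the labels are spread across several classes while every prediction is class 9, which points at the prediction pipeline rather than at the labels.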
The images I used were 32x32, so is it possible that ResNet is so deep that the dimensions shrank and there was no meaningful data left to learn in the final layers??
No, I don’t think that’s the issue; ResNet will learn something for sure.
So what might be the problem…?
You can actually print the “summary()” of your model, and that will show you the dimensions at every level. They showed us examples of how to do that in several of the assignments here, e.g. look in the Residual Networks notebook and search for the string “summary”.
It sounds like there is something the matter with your code, e.g. you are not doing the prediction logic correctly. This is not one of the assignments in the course, right? It sounds like you are exploring on your own. That’s the best way to learn, but along the way you need to be building up your own analytical skills: solving this type of problem is fundamental to being successful at this. Once you finish here and are trying to apply these ideas in a job setting, you won’t have Gent and me to fall back on. One thing to start with is whether you are using the loss function in from_logits = True mode or not. Is there a softmax activation in your model as defined? If not, and you use from_logits = True mode on the loss function, then in “predict” mode you need to manually apply the softmax. Note that Prof Ng always uses the mode in the course here where the outputs are logits, not activations. It’s more efficient, but you need to realize how to handle that.
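To make that concrete, here is a minimal sketch in plain NumPy (the `logits` array is made up, standing in for your model’s raw output) of applying the softmax manually when the model was trained with from_logits = True:

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Hypothetical raw model outputs (logits) for 3 images, 10 CIFAR-10 classes.
logits = np.array([
    [ 2.0, -1.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.1],
    [-0.5,  3.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.2],
    [ 0.0,  0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 4.0],
])

probs = softmax(logits)              # convert logits to probabilities
pred_classes = probs.argmax(axis=-1)

print("row sums:", probs.sum(axis=-1))      # each row sums to 1
print("predicted classes:", pred_classes)
```

Note that argmax gives the same class whether you take it over the logits or the probabilities; where people go wrong is treating raw logits as probabilities (e.g. thresholding them at 0.5), which produces nonsense.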