In the week 2 Alpaca transfer learning, it strikes me as curious that throughout training, performance on the cross-validation data is consistently better than on training data by both accuracy and cross-entropy. I’m not used to seeing that behavior. Would anyone care to comment? Is this a result of augmenting the training but not the validation set?
This can happen when the images in the cross-validation set happen to be more similar to the images the model already predicts correctly in the training set, i.e. the validation split turns out to be "easier" than the training split.
The reason for augmenting images is to reduce overfitting to the training set.
One way for you to explore this further is to remove the image augmentation parameters other than rescale and check whether the behavior changes (see the sketch below).
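As a concrete way to try that experiment, here is a minimal sketch, assuming the notebook uses Keras' `ImageDataGenerator`; the specific augmentation parameters below are illustrative placeholders, not copied from the assignment:

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Augmented training generator: random transforms applied only to training images.
# (rotation_range, horizontal_flip, zoom_range are example parameters.)
train_datagen = ImageDataGenerator(
    rescale=1.0 / 255,
    rotation_range=20,
    horizontal_flip=True,
    zoom_range=0.2,
)

# Validation generator: only rescaling, no augmentation.
val_datagen = ImageDataGenerator(rescale=1.0 / 255)

# To run the experiment, swap in a training generator with rescale only
# and compare the training vs. validation curves.
train_datagen_no_aug = ImageDataGenerator(rescale=1.0 / 255)
```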
The other point to make is that the training here is non-deterministic. They don't set any random seeds, but I tried that and still got different results every time. FWIW, I do not see the validation accuracy exceed the training accuracy in the experiments I tried. Here's the training before fine-tuning:
Actually, now that I look at those numbers, it's a little suspicious: the validation accuracy is mostly constant at 0.3846 and the training accuracy bounces around (not monotonically increasing). Hmmmm. I would venture that this is not typical behavior and that further investigation, and perhaps hyperparameter tuning, is warranted here.
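For reference, here is roughly what I meant by setting the seeds above, just a minimal sketch and not part of the assignment code; note that even with this, GPU kernels can still introduce some non-determinism, so runs may not be exactly reproducible:

```python
import random

import numpy as np
import tensorflow as tf

SEED = 42  # arbitrary value, purely illustrative

random.seed(SEED)        # Python's built-in RNG
np.random.seed(SEED)     # NumPy (used for some shuffling/initialization)
tf.random.set_seed(SEED) # TensorFlow ops (weight init, dropout, shuffling)
```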
Sorry, but they have asked the mentors to search for unanswered questions and make sure they get resolved, to make the forum stats look better. Your question is a good one, and I apologize that no one responded when you first asked it. There is still value in answering old questions, since the forum history continues to be useful. Unlike the Coursera forums (whose search engine is dysfunctional), people are able to find pre-existing posts on Discourse and derive value from them.