Help in interpreting validation accuracy

Hi,
I am having trouble understanding the behavior of my model:

  • My small model immediately reached a validation accuracy of 0.8 (higher than the training accuracy)

  • So I decided to deliberately overfit the training data by increasing the model size. The result is attached. I understand that I am overfitting the training data, but why is the validation accuracy of all models higher than the training accuracy in the first epochs? The validation dataset is symmetric and stratified, so I would have expected the validation accuracy to start at around 0.5.

Epoch 1/15
466/1125 [===========>…] - ETA: 4:17 - loss: 0.6797 - accuracy: 0.5911

/home/mirko/anaconda3/envs/tens1/lib/python3.9/site-packages/PIL/TiffImagePlugin.py:845: UserWarning: Truncated File Read
warnings.warn(str(msg))

1125/1125 [==============================] - 462s 410ms/step - loss: 0.6393 - accuracy: 0.6368 - val_loss: 0.5738 - val_accuracy: 0.6980
Epoch 2/15
1125/1125 [==============================] - 475s 422ms/step - loss: 0.4977 - accuracy: 0.7565 - val_loss: 0.4500 - val_accuracy: 0.7912
Epoch 3/15
1125/1125 [==============================] - 512s 455ms/step - loss: 0.3898 - accuracy: 0.8237 - val_loss: 0.3862 - val_accuracy: 0.8308
Epoch 4/15
1125/1125 [==============================] - 466s 414ms/step - loss: 0.2720 - accuracy: 0.8835 - val_loss: 0.4553 - val_accuracy: 0.8032
Epoch 5/15
1125/1125 [==============================] - 464s 412ms/step - loss: 0.1270 - accuracy: 0.9525 - val_loss: 0.5589 - val_accuracy: 0.8352
Epoch 6/15
1125/1125 [==============================] - 465s 414ms/step - loss: 0.0490 - accuracy: 0.9844 - val_loss: 0.6640 - val_accuracy: 0.8216
Epoch 7/15
1125/1125 [==============================] - 467s 415ms/step - loss: 0.0393 - accuracy: 0.9878 - val_loss: 0.9330 - val_accuracy: 0.8204
Epoch 8/15
1125/1125 [==============================] - 461s 409ms/step - loss: 0.0291 - accuracy: 0.9906 - val_loss: 1.0848 - val_accuracy: 0.8216
Epoch 9/15
1125/1125 [==============================] - 467s 415ms/step - loss: 0.0263 - accuracy: 0.9920 - val_loss: 1.5080 - val_accuracy: 0.8008
Epoch 10/15
1125/1125 [==============================] - 488s 434ms/step - loss: 0.0403 - accuracy: 0.9886 - val_loss: 1.0494 - val_accuracy: 0.8228
Epoch 11/15
1125/1125 [==============================] - 446s 396ms/step - loss: 0.0156 - accuracy: 0.9969 - val_loss: 1.4513 - val_accuracy: 0.8168
Epoch 12/15
1125/1125 [==============================] - 484s 430ms/step - loss: 0.0116 - accuracy: 0.9969 - val_loss: 1.3318 - val_accuracy: 0.8204
Epoch 13/15
1125/1125 [==============================] - 471s 419ms/step - loss: 0.0265 - accuracy: 0.9922 - val_loss: 1.1205 - val_accuracy: 0.8100
Epoch 14/15
1125/1125 [==============================] - 452s 402ms/step - loss: 0.0226 - accuracy: 0.9943 - val_loss: 1.2086 - val_accuracy: 0.8176
Epoch 15/15
1125/1125 [==============================] - 450s 400ms/step - loss: 0.0259 - accuracy: 0.9932 - val_loss: 1.1753 - val_accuracy: 0.8004

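It is normal for a model to show a higher validation accuracy than training accuracy in the first few epochs, and it does not mean that the datasets are mixed up. The training accuracy that Keras prints is a running average over all batches of the epoch, so it includes the early batches, when the weights were still close to their random initialization. The validation accuracy is computed once, at the end of the epoch, with the weights the model has after all of those updates. If the model uses dropout or similar regularization layers, these are active during training but switched off during validation, which can widen the gap further.

In your case this is visible in the first epoch: after 466 of 1125 steps the running training accuracy is 0.5911, not far from the 0.5 you expected for a symmetric, stratified dataset, and by the end of the epoch the average has risen to 0.6368. The validation accuracy of 0.6980 is measured only after all 1125 steps, so it reflects a model that has already learned something. You would only see a validation accuracy near 0.5 if you evaluated before any training. As the model trains for more epochs, it starts to overfit the training data, and the training accuracy overtakes the validation accuracy.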
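Keras reports the training accuracy as a running average over the batches of an epoch, while val_accuracy is computed once after the epoch ends. The size of the resulting gap can be illustrated with a small sketch. The numbers are hypothetical (a model whose per-batch accuracy rises linearly from 0.50 to 0.70 within one epoch), not taken from your run, and `epoch_summary` is an illustrative helper, not a Keras function.

```python
def epoch_summary(start_acc, end_acc, steps):
    """Return (running-average accuracy, accuracy on the last batch).

    Assumes, for illustration only, that per-batch accuracy improves
    linearly from start_acc to end_acc over the epoch.
    """
    per_batch = [
        start_acc + (end_acc - start_acc) * i / (steps - 1)
        for i in range(steps)
    ]
    # What a progress bar would report as "accuracy" at the end of the epoch
    running_avg = sum(per_batch) / steps
    # What the model can actually do once the epoch is over
    return running_avg, per_batch[-1]


reported_train, at_epoch_end = epoch_summary(0.50, 0.70, steps=1125)
print(f"reported training accuracy: {reported_train:.2f}")
print(f"accuracy at end of epoch:   {at_epoch_end:.2f}")
```

Under these assumptions the reported training accuracy lags behind what the model achieves at the end of the epoch, and the end-of-epoch figure is the one that is comparable to val_accuracy. To compare like with like in a real run, you could call `model.evaluate` on the training data after an epoch, since that is also computed with fixed weights.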

Your larger model does start to overfit early: from epoch 4 onward the training accuracy (0.8835) exceeds the validation accuracy (0.8032), and val_loss climbs from its minimum of 0.3862 in epoch 3 to above 1.0, while val_accuracy stays between roughly 0.80 and 0.84. Overfitting can happen when a model is too complex for the amount of training data available, and it shows up as a high training accuracy combined with a lower validation accuracy. If you want better generalization rather than a deliberately overfitted model, you may want to try regularization techniques or reduce the model size.
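To see where overfitting sets in, you can compare the per-epoch values from your log directly. This is a minimal sketch with the numbers copied from the output above; the variable names are illustrative, and treating the minimum of val_loss as the stopping point is one common heuristic (early stopping), not the only option.

```python
# Per-epoch values copied from the training log above (epochs 1 to 15)
train_acc = [0.6368, 0.7565, 0.8237, 0.8835, 0.9525, 0.9844, 0.9878, 0.9906,
             0.9920, 0.9886, 0.9969, 0.9969, 0.9922, 0.9943, 0.9932]
val_acc = [0.6980, 0.7912, 0.8308, 0.8032, 0.8352, 0.8216, 0.8204, 0.8216,
           0.8008, 0.8228, 0.8168, 0.8204, 0.8100, 0.8176, 0.8004]
val_loss = [0.5738, 0.4500, 0.3862, 0.4553, 0.5589, 0.6640, 0.9330, 1.0848,
            1.5080, 1.0494, 1.4513, 1.3318, 1.1205, 1.2086, 1.1753]

# Epoch (1-based) with the lowest validation loss
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__) + 1

# First epoch (1-based) in which training accuracy exceeds validation accuracy
crossover_epoch = next(
    i + 1 for i, (t, v) in enumerate(zip(train_acc, val_acc)) if t > v
)

print(f"lowest val_loss at epoch {best_epoch}")
print(f"train accuracy first exceeds val accuracy at epoch {crossover_epoch}")
```

For this log the validation loss is lowest at epoch 3 and the training accuracy overtakes the validation accuracy at epoch 4. In Keras, `tf.keras.callbacks.EarlyStopping(monitor="val_loss", restore_best_weights=True)` automates the same idea during training.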

Dear Juan,
thank you for your answer.
I was not aware that it is normal for the validation accuracy to be higher in the beginning, so I do not need to worry that I made a mistake with the datasets. Thanks!
