Training accuracy not smoothly increasing

Is it okay if the training accuracy oscillates up and down during training? I expected it to increase smoothly as long as the model architecture is big enough to fit the data, or simply to decrease if the model is underfitting the data. Why is it going up and down seemingly at random?

code:
[snippet removed by mentor]

  • code augments the data with transformations other than rescale
  • model doesn’t contain a dropout layer
  • learning rate is on the order of 1e-4

result:

Found 2000 images belonging to 2 classes.
Found 1000 images belonging to 2 classes.
Epoch 1/20
100/100 - 22s - loss: 0.6925 - accuracy: 0.5175 - val_loss: 0.6829 - val_accuracy: 0.5220 - 22s/epoch - 216ms/step
Epoch 2/20
100/100 - 19s - loss: 0.6793 - accuracy: 0.5580 - val_loss: 0.6613 - val_accuracy: 0.6130 - 19s/epoch - 187ms/step
Epoch 3/20
100/100 - 18s - loss: 0.6730 - accuracy: 0.5765 - val_loss: 0.6650 - val_accuracy: 0.5990 - 18s/epoch - 182ms/step
Epoch 4/20
100/100 - 19s - loss: 0.6635 - accuracy: 0.5955 - val_loss: 0.7331 - val_accuracy: 0.5180 - 19s/epoch - 188ms/step
Epoch 5/20
100/100 - 18s - loss: 0.6562 - accuracy: 0.6055 - val_loss: 0.6222 - val_accuracy: 0.6540 - 18s/epoch - 176ms/step
Epoch 6/20
100/100 - 18s - loss: 0.6331 - accuracy: 0.6465 - val_loss: 0.6163 - val_accuracy: 0.6650 - 18s/epoch - 176ms/step
Epoch 7/20
100/100 - 19s - loss: 0.6301 - accuracy: 0.6420 - val_loss: 0.6044 - val_accuracy: 0.6760 - 19s/epoch - 186ms/step
Epoch 8/20
100/100 - 20s - loss: 0.6176 - accuracy: 0.6555 - val_loss: 0.5835 - val_accuracy: 0.6850 - 20s/epoch - 201ms/step
Epoch 9/20
100/100 - 18s - loss: 0.6096 - accuracy: 0.6600 - val_loss: 0.5808 - val_accuracy: 0.6870 - 18s/epoch - 177ms/step
Epoch 10/20
100/100 - 19s - loss: 0.6060 - accuracy: 0.6695 - val_loss: 0.5914 - val_accuracy: 0.6730 - 19s/epoch - 186ms/step
Epoch 11/20
100/100 - 17s - loss: 0.5995 - accuracy: 0.6690 - val_loss: 0.5658 - val_accuracy: 0.7200 - 17s/epoch - 174ms/step
Epoch 12/20
100/100 - 20s - loss: 0.6027 - accuracy: 0.6680 - val_loss: 0.5672 - val_accuracy: 0.6880 - 20s/epoch - 200ms/step
Epoch 13/20
100/100 - 18s - loss: 0.5911 - accuracy: 0.6830 - val_loss: 0.5652 - val_accuracy: 0.7020 - 18s/epoch - 177ms/step
Epoch 14/20
100/100 - 18s - loss: 0.5920 - accuracy: 0.6845 - val_loss: 0.5560 - val_accuracy: 0.7080 - 18s/epoch - 185ms/step
Epoch 15/20
100/100 - 18s - loss: 0.5768 - accuracy: 0.6955 - val_loss: 0.5576 - val_accuracy: 0.7070 - 18s/epoch - 177ms/step
Epoch 16/20
100/100 - 21s - loss: 0.5678 - accuracy: 0.7035 - val_loss: 0.5567 - val_accuracy: 0.7070 - 21s/epoch - 210ms/step
Epoch 17/20
100/100 - 19s - loss: 0.5707 - accuracy: 0.6950 - val_loss: 0.5518 - val_accuracy: 0.7020 - 19s/epoch - 185ms/step
Epoch 18/20
100/100 - 18s - loss: 0.5654 - accuracy: 0.7090 - val_loss: 0.5239 - val_accuracy: 0.7330 - 18s/epoch - 177ms/step
Epoch 19/20
100/100 - 17s - loss: 0.5555 - accuracy: 0.7135 - val_loss: 0.5907 - val_accuracy: 0.6840 - 17s/epoch - 175ms/step
Epoch 20/20
100/100 - 19s - loss: 0.5562 - accuracy: 0.7080 - val_loss: 0.5270 - val_accuracy: 0.7390 - 19s/epoch - 186ms/step

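The learning curve is likely to be bumpy when you augment images with transformations other than rescale.
Augmentations other than rescale generate new variations of the data each time the same batch is encountered during training, so the model is effectively scored on slightly different training images in every epoch. Small dips like the ones in your log (for example, from epoch 6 to 7 and from epoch 16 to 17) are expected as long as the overall trend is upward.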
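
To make the point concrete, here is a minimal NumPy sketch, not the code from the original post (which was removed). The `random_augment` function is a hypothetical stand-in for whatever transformations the training pipeline uses; it only assumes a random horizontal flip and a small random shift. It shows that rescaling returns identical pixels on every pass, while augmentation returns a different variant each time.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only so the sketch is repeatable


def rescale(image):
    """Deterministic preprocessing: the same input always gives the same output."""
    return image / 255.0


def random_augment(image, rng):
    """Hypothetical augmentation: random horizontal flip plus a random horizontal shift."""
    out = image[:, ::-1] if rng.random() < 0.5 else image
    shift = int(rng.integers(-2, 3))  # shift by -2..2 pixels
    return np.roll(out, shift, axis=1) / 255.0


# Stand-in for a single training image.
image = np.arange(64, dtype=np.float64).reshape(8, 8)

# Rescale only: every epoch sees exactly the same pixels.
rescaled = [rescale(image) for _ in range(5)]
all_identical = all(np.array_equal(rescaled[0], r) for r in rescaled)

# With augmentation: each epoch sees a freshly drawn variant of the image.
variants = [random_augment(image, rng) for _ in range(5)]
distinct = {v.tobytes() for v in variants}

print("rescale identical across epochs:", all_identical)
print("distinct augmented variants in 5 epochs:", len(distinct))
```

Because the training accuracy in the log is measured on these freshly drawn variants, it can move up or down a little from one epoch to the next even when the model itself is improving.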

Thanks for your explanation, that makes sense now. If I may ask, I have another question:
What does it indicate when validation accuracy is higher than training accuracy?

This can happen when the validation dataset is a lot easier to classify than the training dataset.

When the validation data covers only a smaller portion of the training data distribution, it’s possible for the model to learn that portion well enough to perform well on validation data while still struggling on the rest of the training data. Do keep in mind that augmentations widen the distribution of training data points, and validation images are typically not augmented.
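
A toy illustration of this effect, under loud assumptions: the "images" are single numbers drawn around -1 and +1, the "model" is a fixed threshold at 0, and additive noise stands in for augmentation. None of this is the original code; it is only a sketch of why the same model can score lower on augmented training data than on clean validation data.

```python
import numpy as np

rng = np.random.default_rng(42)


def make_split(n, rng):
    """Two classes centred at -1 and +1 (toy stand-in for the two image classes)."""
    labels = rng.integers(0, 2, size=n)
    features = (2 * labels - 1) + rng.normal(0.0, 0.5, size=n)
    return features, labels


def accuracy(features, labels):
    """Fixed toy 'model': predict class 1 when the feature is positive."""
    return float(np.mean((features > 0).astype(int) == labels))


x_train, y_train = make_split(2000, rng)
x_val, y_val = make_split(1000, rng)

# Assumption: extra random noise plays the role of augmentation,
# and it is applied to the training inputs only.
x_train_aug = x_train + rng.normal(0.0, 1.0, size=x_train.shape)

acc_train_aug = accuracy(x_train_aug, y_train)  # analogous to the logged training accuracy
acc_train_clean = accuracy(x_train, y_train)    # same model on un-augmented training data
acc_val = accuracy(x_val, y_val)                # validation data is not augmented

print(f"train (augmented): {acc_train_aug:.3f}")
print(f"train (clean):     {acc_train_clean:.3f}")
print(f"validation:        {acc_val:.3f}")
```

In a real Keras project, a comparable check would be to call `model.evaluate` on the training images passed through a generator that only rescales, and compare that number with the validation accuracy.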

Thank you, it is all clear now.