Advice on training a ViT model with PyTorch

Hey there :wave:,
I recently implemented a ViT model in PyTorch and wanted to try it on a simple classification task, so I trained it on a leaf disease classification dataset.

After 60 epochs the accuracy had only reached 68%, and it doesn't improve with further epochs; it just bounces around 68%. Is this behavior normal? I know ViTs are challenging to train, but shouldn't the accuracy improve at some point?

You can check the uploaded Kaggle notebook, where I trained the model using PyTorch.

Can you tell me what I’m doing wrong, or what I should be doing?
vit-with-pytorch.ipynb (59.5 KB)

The ViT architecture needs a lot of data; it's not like a CNN.
So I think you need more data.
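
If collecting more images isn't possible, one common workaround is to augment the ones you already have. Here is a minimal sketch, assuming your batches are float tensors of shape (N, C, H, W). The function name `augment_batch` and the `pad` parameter are my own and not taken from your notebook, so treat this as an illustration rather than something verified on your data.

```python
import torch
import torch.nn.functional as F


def augment_batch(images: torch.Tensor, pad: int = 4) -> torch.Tensor:
    """Random horizontal flip plus random crop for a batch shaped (N, C, H, W).

    Hypothetical helper for illustration; `pad` is the number of zero pixels
    added on every side before a random H x W window is cut back out.
    """
    n, _, h, w = images.shape

    # Flip roughly half of the images left-to-right.
    flip = (torch.rand(n) < 0.5).to(images.device)
    images = torch.where(flip[:, None, None, None], images.flip(-1), images)

    # Zero-pad the borders, then take a randomly shifted H x W crop per image.
    padded = F.pad(images, (pad, pad, pad, pad))
    tops = torch.randint(0, 2 * pad + 1, (n,))
    lefts = torch.randint(0, 2 * pad + 1, (n,))
    out = torch.empty_like(images)
    for i in range(n):
        t, l = int(tops[i]), int(lefts[i])
        out[i] = padded[i, :, t:t + h, l:l + w]
    return out
```

You would call it only on training batches, e.g. `images = augment_batch(images)` right before the forward pass, and leave the validation images untouched. It won't replace a bigger dataset, but it should make overfitting a bit harder.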