I have developed a deep CNN and trained it on grayscale images (7,800 images in total). About 50 epochs of training increased the accuracy by only 4%, while the training loss decreased by a very small amount, from 0.6858 to 0.6325.
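A quick sanity check on those loss values, as a sketch only: it assumes a two-class problem trained with cross-entropy loss, which is not stated here. Under that assumption, a model that always predicts 0.5 has a loss of ln 2 ≈ 0.6931, so the snippet compares the two reported losses against that chance-level value.

```python
import math

# Assumption (not stated in the post): binary classification with
# cross-entropy loss. A model that always outputs 0.5 then has loss ln(2).
chance_loss = math.log(2)

# Training-loss values reported in the post.
start_loss, end_loss = 0.6858, 0.6325

for name, loss in [("start", start_loss), ("end", end_loss)]:
    # exp(-loss) is the geometric-mean probability given to the correct class.
    mean_prob = math.exp(-loss)
    print(
        f"{name}: loss={loss:.4f}, "
        f"gap to chance={chance_loss - loss:.4f}, "
        f"mean correct-class prob={mean_prob:.3f}"
    )
```

If the assumption holds, both values sit close to the chance-level loss, which would suggest the network has barely started to learn rather than converged.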
I am really unsure about the hyperparameters, in particular the learning rate and the weight_decay value. I trained the model for almost 100 epochs, but convergence still looks very problematic, and the variance gets higher when training for more epochs.
Any kind of suggestion is welcome. Thank you.