Impact batch size

Hi, I was struggling to reach 95% training accuracy. In the end, I found out that a batch size of 20 was not sufficient. After changing it to 100, I got the desired results.

I have read the documentation, but could not find a proper (understandable) answer about the impact of the batch size. Could anyone explain it to me?

Thanks in advance

Hi, @christian_Van_Rodijn!

As you might imagine, a larger batch size means that the optimizer “sees” more data on each step. The gradient computed over a bigger batch is a less noisy, “better informed” estimate of the true gradient, which can make each descent step more stable and lead to a better overall result.
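To make the noise-reduction point concrete, here is a small standalone sketch (not the course code; the gradient value and noise scale are made up for illustration) showing that averaging simulated per-sample gradients over a larger mini-batch yields a lower-variance estimate:

```python
import numpy as np

# Hypothetical setup: each sample gives a noisy estimate of the true gradient.
rng = np.random.default_rng(0)
true_grad = 2.0
per_sample_noise = 5.0  # assumed noise scale, for illustration only

def batch_grad_std(batch_size, n_trials=10_000):
    # Draw many mini-batches and measure how spread out the
    # batch-averaged gradient estimates are.
    grads = true_grad + per_sample_noise * rng.standard_normal((n_trials, batch_size))
    return grads.mean(axis=1).std()

std_20 = batch_grad_std(20)
std_100 = batch_grad_std(100)
print(f"std of gradient estimate, batch=20:  {std_20:.3f}")
print(f"std of gradient estimate, batch=100: {std_100:.3f}")
```

The spread shrinks roughly like 1/sqrt(batch size), which is one reason going from 20 to 100 can make training noticeably smoother.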

Thanks for the clear explanation!