Performance: SGD vs. Adam

After completing the Deep Learning Specialization course, I implemented a neural network from scratch with NumPy, using SGD (stochastic gradient descent) and Adam (adaptive moment estimation). My data are generated with sklearn.datasets.make_moons.

Here are the results:

| Optimizer | Training Accuracy | Test Accuracy |
|-----------|-------------------|---------------|
| SGD       | 0.9699            | 0.9715        |
| Adam      | 0.96955           | 0.97065       |

Training samples: 80,000; test samples: 20,000.

  1. SGD slightly outperforms Adam. Is this just calculation noise, or is it a normal result?

  2. Test accuracy is slightly higher than training accuracy. Is something wrong, or is this also just noise?

The differences in performance you see are not statistically significant, so I would not worry about either of them.
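A quick back-of-envelope check makes this concrete: treating test accuracy as a binomial proportion, its standard error on 20,000 samples is larger than both gaps in the table.

```python
import math

p, n = 0.97, 20_000  # approximate accuracy and test-set size from the post
se = math.sqrt(p * (1 - p) / n)  # standard error of a binomial proportion
print(f"standard error of test accuracy: {se:.4f}")  # ~0.0012

# Gaps reported in the table:
sgd_vs_adam = 0.9715 - 0.97065   # 0.00085, less than one standard error
test_vs_train = 0.9715 - 0.9699  # 0.0016, roughly one standard error
```

Both observed differences are within about one standard error of zero, so they are indistinguishable from sampling noise.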
