After completing the deep learning specialization course, I implemented a neural network from scratch in NumPy with SGD (stochastic gradient descent) and Adam (adaptive moment estimation). My data are generated with sklearn.datasets.make_moons.
The results are:
| Optimizer | Training Accuracy | Test Accuracy |
|---|---|---|
| SGD | 0.9699 | 0.9715 |
| Adam | 0.96955 | 0.97065 |
Training samples: 80,000; test samples: 20,000.
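For context, here is a minimal sketch of the setup: the make_moons split plus a single Adam update on a one-unit logistic model. The noise level, learning rate, and Adam hyperparameters here are placeholder assumptions, not the values from my actual experiment, and the real network has more layers.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split

# Assumed noise/seed; the real experiment's values may differ.
X, y = make_moons(n_samples=100_000, noise=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)  # 80,000 train / 20,000 test

w, b = np.zeros(2), 0.0  # single logistic unit, stand-in for the full net

def grads(w, b, X, y):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # sigmoid output
    dz = (p - y) / len(y)                   # dL/dz for cross-entropy loss
    return X.T @ dz, dz.sum()

# One Adam step for w (the bias b is updated the same way).
lr, b1, b2, eps, t = 1e-3, 0.9, 0.999, 1e-8, 1
mw, vw = np.zeros(2), np.zeros(2)
gw, gb = grads(w, b, X_train, y_train)
mw = b1 * mw + (1 - b1) * gw            # first-moment estimate
vw = b2 * vw + (1 - b2) * gw ** 2       # second-moment estimate
m_hat = mw / (1 - b1 ** t)              # bias correction
v_hat = vw / (1 - b2 ** t)
w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
```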
- SGD slightly outperforms Adam. Is this just calculation noise, or is it a normal outcome?
- Test accuracy is slightly higher than training accuracy. Is something wrong, or is this also just noise?