After completing the deep learning specialization course, I implemented a neural network from scratch in NumPy with SGD (stochastic gradient descent) and Adam (adaptive moment estimation). My data are generated with sklearn.datasets.make_moons.
The results are:
| Optimizer | Training Accuracy | Test Accuracy |
|---|---|---|
| SGD | 0.9699 | 0.9715 |
| Adam | 0.96955 | 0.97065 |
Training samples: 80,000; test samples: 20,000.
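For context, here is a minimal sketch of the setup: the make_moons split plus a single Adam update on a one-unit logistic model. The noise level, learning rate, and Adam hyperparameters here are placeholder assumptions, not the values from my actual experiment, and the real network has more layers.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split

# Assumed noise/seed; the real experiment's values may differ.
X, y = make_moons(n_samples=100_000, noise=0.2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)  # 80,000 train / 20,000 test

w, b = np.zeros(2), 0.0  # single logistic unit, stand-in for the full net

def grads(w, b, X, y):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # sigmoid output
    dz = (p - y) / len(y)                   # dL/dz for cross-entropy loss
    return X.T @ dz, dz.sum()

# One Adam step for w (the bias b is updated the same way).
lr, b1, b2, eps, t = 1e-3, 0.9, 0.999, 1e-8, 1
mw, vw = np.zeros(2), np.zeros(2)
gw, gb = grads(w, b, X_train, y_train)
mw = b1 * mw + (1 - b1) * gw            # first-moment estimate
vw = b2 * vw + (1 - b2) * gw ** 2       # second-moment estimate
m_hat = mw / (1 - b1 ** t)              # bias correction
v_hat = vw / (1 - b2 ** t)
w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
```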
- SGD slightly outperforms Adam. Is this just calculation noise, or is it a normal outcome?
- Test accuracy is slightly higher than training accuracy. Is something wrong, or is this also just noise?