Stochastic Gradient Descent convergence

Why won’t SGD ever converge?

Where does it say SGD never converges? Do you have a reference? I think it’s the same answer as with everything here: it depends, both on your data and on your choice of the relevant hyperparameters.

Can you please give us an example? When would Stochastic Gradient Descent converge, and when would it not?

Here is an example.

I used the “Boston Housing price regression dataset” that ships with Keras.

# imports needed for the snippets below
import numpy as np
import tensorflow as tf

# load the Boston Housing dataset bundled with Keras
boston_housing = tf.keras.datasets.boston_housing
(train_data, train_labels), (test_data, test_labels) = boston_housing.load_data()

# shuffle the training set
order = np.random.permutation(len(train_labels))
train_data = train_data[order]
train_labels = train_labels[order]

# normalize the data using training-set statistics
mean = train_data.mean(axis=0)
std = train_data.std(axis=0)
train_data = (train_data - mean) / std
test_data = (test_data - mean) / std

Then I created a small model consisting of 3 Dense layers to predict the housing price, and used SGD for optimization.
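The original post does not show the model code, but a minimal sketch might look like this (the layer widths, activation, learning rate, and loss function are my assumptions, not the exact values used above):

# Sketch of a small 3-Dense-layer regression model trained with SGD.
# Layer sizes, learning rate, and loss are assumed values, not the
# exact ones from the original post.
def build_model(learning_rate=0.01):
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation='relu',
                              input_shape=(train_data.shape[1],)),
        tf.keras.layers.Dense(64, activation='relu'),
        tf.keras.layers.Dense(1),  # single output: the predicted price
    ])
    model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=learning_rate),
                  loss='mse', metrics=['mae'])
    return model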

Here is the result: I ran it 3 times, re-shuffling the same data before each run. Even though I was using the same data, one trial did not converge.
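For reference, the repeated trials can be reproduced with a loop like this (a sketch; the epoch count and validation split are assumptions):

# Run three trials on re-shuffled copies of the same training data and
# compare the final validation losses across runs.
for trial in range(3):
    order = np.random.permutation(len(train_labels))
    model = build_model()
    history = model.fit(train_data[order], train_labels[order],
                        epochs=100, validation_split=0.2, verbose=0)
    print(f"Trial {trial}: final val loss = {history.history['val_loss'][-1]:.2f}")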

Since it is “stochastic”, it is difficult to say “deterministically” whether it will converge or not. So, as Paul said, what we can say is… “it depends”.
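To make the hyperparameter dependence concrete: the same model can converge or diverge depending only on the learning rate. A sketch (the rates 0.01 and 1.0 are arbitrary illustrative values):

# With a learning rate that is too large, the MSE loss on this dataset
# typically blows up (often to NaN); with a small one, it decreases.
for lr in (0.01, 1.0):
    model = build_model(learning_rate=lr)
    history = model.fit(train_data, train_labels, epochs=50, verbose=0)
    print(f"lr={lr}: final training loss = {history.history['loss'][-1]}")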