I didn’t modify that notebook, but I did test this using TensorFlow in the notebook for Course 2, Week 3. I found that Nesterov was a little faster to compute than Adam, but it didn’t give quite as good a result on the test set. The graphs were interesting, showing some oscillations; the cost in the epoch output also goes back up at epochs 70 and 90. I used process_time for timing, and my optimizer was
optimizer = tf.keras.optimizers.SGD(learning_rate, momentum=0.9, nesterov=True)
I tried varying the learning_rate and the momentum up and down, but these hyperparameters worked best.
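In case it helps to see what nesterov=True changes, here is a minimal pure-Python sketch of the update rule as I understand it from the Keras SGD documentation (velocity = momentum * velocity - learning_rate * g, then w = w + momentum * velocity - learning_rate * g). The helper names sgd_step and minimize_quadratic, the toy objective f(w) = w**2, and the learning rate of 0.1 are my own assumptions for illustration; none of this is the notebook's code.

```python
def sgd_step(w, v, grad, learning_rate, momentum=0.9, nesterov=True):
    """One SGD-with-momentum update on a scalar weight (illustrative only)."""
    # Velocity accumulates a decaying sum of past gradient steps.
    v = momentum * v - learning_rate * grad
    if nesterov:
        # Nesterov variant: apply the momentum "look-ahead" on top of the plain gradient step.
        w = w + momentum * v - learning_rate * grad
    else:
        # Classic momentum: move by the velocity alone.
        w = w + v
    return w, v


def minimize_quadratic(steps=200, learning_rate=0.1, momentum=0.9, nesterov=True):
    """Minimize the toy objective f(w) = w**2 (gradient 2*w), starting from w = 5."""
    w, v = 5.0, 0.0
    for _ in range(steps):
        w, v = sgd_step(w, v, 2.0 * w, learning_rate, momentum, nesterov)
    return w


if __name__ == "__main__":
    print("nesterov:", minimize_quadratic(nesterov=True))
    print("classic: ", minimize_quadratic(nesterov=False))
```

On this toy problem both variants head toward w = 0 while overshooting and swinging back, which is the same kind of oscillation I saw in the graphs, though a one-dimensional quadratic is obviously much simpler than the notebook's network.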
I’d like to paste in the graphs, but I don’t know how to do that. I tried “copy cell attachment”, but pasting that just gives me a giant block of text.
Here is the text of the epoch output, anyway.
Cost after epoch 0: 1.836860
Train accuracy: tf.Tensor(0.18055555, shape=(), dtype=float32)
Test_accuracy: tf.Tensor(0.28333333, shape=(), dtype=float32)
Cost after epoch 10: 1.282068
Train accuracy: tf.Tensor(0.5009259, shape=(), dtype=float32)
Test_accuracy: tf.Tensor(0.49166667, shape=(), dtype=float32)
Cost after epoch 20: 1.090269
Train accuracy: tf.Tensor(0.57685184, shape=(), dtype=float32)
Test_accuracy: tf.Tensor(0.31666666, shape=(), dtype=float32)
Cost after epoch 30: 0.998442
Train accuracy: tf.Tensor(0.60925925, shape=(), dtype=float32)
Test_accuracy: tf.Tensor(0.53333336, shape=(), dtype=float32)
Cost after epoch 40: 0.908589
Train accuracy: tf.Tensor(0.662963, shape=(), dtype=float32)
Test_accuracy: tf.Tensor(0.55833334, shape=(), dtype=float32)
Cost after epoch 50: 0.857781
Train accuracy: tf.Tensor(0.69074076, shape=(), dtype=float32)
Test_accuracy: tf.Tensor(0.60833335, shape=(), dtype=float32)
Cost after epoch 60: 0.752651
Train accuracy: tf.Tensor(0.73425925, shape=(), dtype=float32)
Test_accuracy: tf.Tensor(0.60833335, shape=(), dtype=float32)
Cost after epoch 70: 0.897831
Train accuracy: tf.Tensor(0.675, shape=(), dtype=float32)
Test_accuracy: tf.Tensor(0.59166664, shape=(), dtype=float32)
Cost after epoch 80: 0.684091
Train accuracy: tf.Tensor(0.7712963, shape=(), dtype=float32)
Test_accuracy: tf.Tensor(0.65, shape=(), dtype=float32)
Cost after epoch 90: 0.785952
Train accuracy: tf.Tensor(0.7212963, shape=(), dtype=float32)
Test_accuracy: tf.Tensor(0.675, shape=(), dtype=float32)
67.626487752 secs
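
For reference, the time above came from process_time. A rough sketch of that kind of measurement is below; the timed helper and the placeholder_training function are hypothetical stand-ins I made up for illustration, and the placeholder loop only burns CPU so the example runs without TensorFlow.

```python
import time


def timed(fn, *args, **kwargs):
    """Run fn and return (result, CPU seconds used by this process)."""
    start = time.process_time()
    result = fn(*args, **kwargs)
    elapsed = time.process_time() - start
    return result, elapsed


def placeholder_training(epochs=100):
    """Hypothetical stand-in for the real training loop: just does some arithmetic."""
    total = 0.0
    for _ in range(epochs):
        total += sum(i * i for i in range(1000))
    return total


if __name__ == "__main__":
    result, elapsed = timed(placeholder_training)
    print(f"{elapsed} secs")
```

One caveat: process_time reports the CPU time of the whole process (user plus system) and does not count time spent sleeping, so for a multithreaded library it can differ from wall-clock time. time.perf_counter is the standard-library option if wall-clock time is what you want to compare.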