I have prepared a solution for the Week 1 Cats vs. Dogs assignment, but somehow training accuracy is not improving. Validation accuracy is above 80%, but training accuracy is stuck at around 86-87%. I have tried various structures, such as adding more layers as shown below, but nothing is working. Can you please help me figure out what is missing?
[Removed code]
Below is my output:
Epoch 5/15
2250/2250 [==============================] - 94s 42ms/step - loss: 0.3982 - accuracy: 0.8366 - val_loss: 0.3962 - val_accuracy: 0.8308
Epoch 6/15
2250/2250 [==============================] - 93s 41ms/step - loss: 0.3952 - accuracy: 0.8452 - val_loss: 0.4186 - val_accuracy: 0.8604
Epoch 7/15
2250/2250 [==============================] - 93s 41ms/step - loss: 0.3907 - accuracy: 0.8476 - val_loss: 0.6279 - val_accuracy: 0.8260
Epoch 8/15
2250/2250 [==============================] - 93s 41ms/step - loss: 0.3732 - accuracy: 0.8547 - val_loss: 0.3105 - val_accuracy: 0.8876
Epoch 9/15
2250/2250 [==============================] - 93s 41ms/step - loss: 0.3777 - accuracy: 0.8552 - val_loss: 0.9783 - val_accuracy: 0.8384
Epoch 10/15
2250/2250 [==============================] - 93s 41ms/step - loss: 0.3729 - accuracy: 0.8577 - val_loss: 0.3625 - val_accuracy: 0.8584
Epoch 11/15
2250/2250 [==============================] - 94s 42ms/step - loss: 0.3697 - accuracy: 0.8663 - val_loss: 0.2649 - val_accuracy: 0.8944
Epoch 12/15
2250/2250 [==============================] - 94s 42ms/step - loss: 0.3889 - accuracy: 0.8679 - val_loss: 0.3938 - val_accuracy: 0.8472
Epoch 13/15
2250/2250 [==============================] - 96s 43ms/step - loss: 0.3647 - accuracy: 0.8676 - val_loss: 0.3323 - val_accuracy: 0.8852
Epoch 14/15
2250/2250 [==============================] - 93s 41ms/step - loss: 0.3646 - accuracy: 0.8650 - val_loss: 0.4246 - val_accuracy: 0.8712
Epoch 15/15
2250/2250 [==============================] - 94s 42ms/step - loss: 0.3719 - accuracy: 0.8655 - val_loss: 0.2876 - val_accuracy: 0.8828
Those layers with softmax are the likely culprit. That activation is not generally used for hidden layers in a binary problem. Try changing just that and let us know the results?
PS: to the best of my knowledge the reason is related to vanishing gradients and the difference between the output magnitudes of relu vs softmax (the latter is constrained to sum to 1.0). Maybe one of the math wonks can weigh in here?
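For a bit of intuition on that magnitude point, here is a tiny sketch (nothing to do with the assignment code, just an illustration with made-up pre-activations): softmax squeezes all 512 hidden units into values that must sum to 1.0, so each individual value, and the gradient flowing through it, is tiny compared to relu.

import tensorflow as tf

# Illustration only: 512 made-up pre-activations for one hidden layer
z = tf.random.normal([512], seed=0)

relu_out = tf.nn.relu(z)          # unbounded; individual values stay roughly O(1)
softmax_out = tf.nn.softmax(z)    # all 512 values forced to sum to 1.0

print(float(tf.reduce_max(relu_out)))      # typically a few units in size
print(float(tf.reduce_sum(softmax_out)))   # exactly 1.0
print(float(tf.reduce_max(softmax_out)))   # tiny, so downstream signals shrink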
No, the softmax part is commented out. I had tried that but it didn't work. Effectively it's this (same network as shown in the videos):
import tensorflow as tf

model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(16, (3,3), activation='relu', input_shape=(150, 150, 3)),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(32, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(64, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
I think I solved it: in place of RMSprop I used Adam and it's working. Grateful if someone could clarify what the reason is behind the difference in performance of these two optimizers?
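For reference, the change is just the optimizer passed to model.compile on the network above. This is only a sketch of what I mean: the loss and metrics are the standard binary-crossentropy setup from the course, and the learning rate here is a placeholder, since my actual compile call was in the code I removed from the post.

# Before: RMSprop, which plateaued around 86-87% training accuracy
# model.compile(optimizer=tf.keras.optimizers.RMSprop(learning_rate=1e-4),
#               loss='binary_crossentropy',
#               metrics=['accuracy'])

# After: Adam, keeping everything else the same
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
              loss='binary_crossentropy',
              metrics=['accuracy'])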
There are lots of discussions comparing and contrasting the two if you search the interweb for “rmsprop vs adam”. This one covers a lot of ground: An overview of gradient descent optimization algorithms, and it includes this paragraph:
In summary, RMSprop is an extension of Adagrad that deals with its radically diminishing learning rates. It is identical to Adadelta, except that Adadelta uses the RMS of parameter updates in the numerator update rule. Adam, finally, adds bias-correction and momentum to RMSprop. Insofar, RMSprop, Adadelta, and Adam are very similar algorithms that do well in similar circumstances. Kingma et al. [14:1] show that its bias-correction helps Adam slightly outperform RMSprop towards the end of optimization as gradients become sparser. Insofar, Adam might be the best overall choice.
NOTE: my emphasis added
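To make that paragraph concrete, here is a rough sketch of the two update rules for a single parameter. It is simplified (per-parameter scalars, Keras-default-style hyperparameter values as assumptions), but it shows exactly what Adam adds on top of RMSprop: a momentum term plus bias correction.

import numpy as np

def rmsprop_step(theta, g, v, lr=0.001, rho=0.9, eps=1e-7):
    # Running average of squared gradients; the step is scaled by its square root
    v = rho * v + (1 - rho) * g**2
    theta = theta - lr * g / (np.sqrt(v) + eps)
    return theta, v

def adam_step(theta, g, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-7):
    # Same squared-gradient average as RMSprop...
    v = beta2 * v + (1 - beta2) * g**2
    # ...plus a momentum-style running average of the gradient itself...
    m = beta1 * m + (1 - beta1) * g
    # ...plus bias correction (t is the 1-based step count) so the averages
    # aren't underestimated in the first few steps
    m_hat = m / (1 - beta1**t)
    v_hat = v / (1 - beta2**t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v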
Thanks a lot! With RMSprop, would it help if I reduce the learning rate further? It feels like with RMSprop the model is stuck at around 87% accuracy.
One of the true math wonks might have a better answer for you, but my own response is somewhere between ‘no’ and ‘it depends’. ‘No’ because RMSprop doesn’t use a constant learning rate anyway; its value at any given iteration is driven by circumstances at that moment, and the learning rate you provide is just that, a starting point. ‘It depends’ because learning rate isn’t the only factor at play: the data itself, the number of epochs, and the batch size all make a difference. You could experiment yourself and plot curves, say by iteratively changing the starting learning rate while holding the others constant, then changing the number of epochs, and so on (a rough sketch of such a sweep is below). Note that 15 is a rather small number of epochs, and you may not know yet whether you have truly reached an optimum or are just at a plateau. Let us know what you find?
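If you want to run that experiment, something along these lines would do it. This is only a sketch under assumptions: build_model(), train_generator, and validation_generator stand in for whatever is in your removed code, and the learning-rate values and epoch count are arbitrary.

import tensorflow as tf

results = {}
for lr in [1e-3, 1e-4, 1e-5]:                      # arbitrary starting rates to sweep
    model = build_model()                          # hypothetical: rebuilds the same architecture each time
    model.compile(optimizer=tf.keras.optimizers.RMSprop(learning_rate=lr),
                  loss='binary_crossentropy',
                  metrics=['accuracy'])
    history = model.fit(train_generator,           # hypothetical generators from your removed code
                        validation_data=validation_generator,
                        epochs=30, verbose=0)      # more than 15, to see past any plateau
    results[lr] = history.history['val_accuracy'][-1]

for lr, acc in results.items():
    print(f'starting lr={lr}: final val_accuracy={acc:.4f}')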