C1_W3_Lab_1: Odd Result When Removing Conv Layer

In C1_W3_Lab_1_improving_accuracy_using_convolutions, it is suggested that we try removing the 2nd of the 2 Conv2D layers to see how this affects training. I had expected training speed to increase but accuracy to decrease. Instead, I saw improved accuracy without the second conv layer.

The base model was:

import tensorflow as tf

# Define the model
model = tf.keras.models.Sequential([

  # Add convolutions and max pooling
  tf.keras.layers.Conv2D(32, (3,3), activation='relu', input_shape=(28, 28, 1)),
  tf.keras.layers.MaxPooling2D(2, 2),
  tf.keras.layers.Conv2D(32, (3,3), activation='relu'),
  tf.keras.layers.MaxPooling2D(2, 2),

  # Add the same layers as before
  tf.keras.layers.Flatten(),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dense(10, activation='softmax')
])
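
For reference, a minimal sketch of the compile/fit/evaluate calls behind the logs below; the Fashion MNIST loading and normalization come from the lab and are my assumption, since the original post does not show them:

# Load and normalize Fashion MNIST, as in the lab
(train_images, train_labels), (test_images, test_labels) = \
    tf.keras.datasets.fashion_mnist.load_data()
train_images, test_images = train_images / 255.0, test_images / 255.0

# Compile with the Adam optimizer and sparse categorical cross-entropy loss
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Train for 5 epochs, then evaluate on the held-out test set
model.fit(train_images, train_labels, epochs=5)
model.evaluate(test_images, test_labels)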

When fit with the Adam optimizer and sparse categorical cross-entropy loss, I saw this performance:

MODEL TRAINING:
Epoch 1/5
1875/1875 [==============================] - 9s 4ms/step - loss: 0.4747 - accuracy: 0.8303
Epoch 2/5
1875/1875 [==============================] - 7s 4ms/step - loss: 0.3139 - accuracy: 0.8852
Epoch 3/5
1875/1875 [==============================] - 7s 4ms/step - loss: 0.2732 - accuracy: 0.9001
Epoch 4/5
1875/1875 [==============================] - 7s 4ms/step - loss: 0.2414 - accuracy: 0.9111
Epoch 5/5
1875/1875 [==============================] - 7s 4ms/step - loss: 0.2185 - accuracy: 0.9186

MODEL EVALUATION:
313/313 [==============================] - 1s 3ms/step - loss: 0.2717 - accuracy: 0.8993

However, upon removing the 2nd conv layer, I saw the improved performance shown below.
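
For concreteness, here is a sketch of the modified model; the exact edit is not shown in the post, so I have assumed the second Conv2D and its paired MaxPooling2D are both removed:

# Modified model: second Conv2D (and, by assumption, its MaxPooling2D) removed
model = tf.keras.models.Sequential([
  tf.keras.layers.Conv2D(32, (3,3), activation='relu', input_shape=(28, 28, 1)),
  tf.keras.layers.MaxPooling2D(2, 2),
  tf.keras.layers.Flatten(),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dense(10, activation='softmax')
])

Compiled and trained the same way, this gave: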

MODEL TRAINING:
Epoch 1/5
1875/1875 [==============================] - 9s 4ms/step - loss: 0.4500 - accuracy: 0.8369
Epoch 2/5
1875/1875 [==============================] - 7s 4ms/step - loss: 0.3004 - accuracy: 0.8895
Epoch 3/5
1875/1875 [==============================] - 7s 4ms/step - loss: 0.2533 - accuracy: 0.9058
Epoch 4/5
1875/1875 [==============================] - 7s 4ms/step - loss: 0.2211 - accuracy: 0.9172
Epoch 5/5
1875/1875 [==============================] - 7s 4ms/step - loss: 0.1972 - accuracy: 0.9271

MODEL EVALUATION:
313/313 [==============================] - 1s 3ms/step - loss: 0.2692 - accuracy: 0.9001

Why should I get a better fit to training data with fewer free parameters? Why is the 2nd conv layer, apparently, harmful?

A change from 89% to 90% test accuracy is not really significant. You might see that much difference just by re-training the same model several times without changing it at all: deep models have non-convex cost functions, so the random weight initialization can lead each training run to a slightly different solution. Note also that the premise of "fewer free parameters" may not hold here: removing the conv layer (and its pooling) enlarges the flattened feature map feeding the first Dense layer from 5×5×32 = 800 values to 13×13×32 = 5408, so the single-conv model actually has more trainable parameters, not fewer.
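
To see this run-to-run variance directly, you can retrain the identical architecture a few times and compare test accuracies. A minimal sketch (tf.keras.utils.set_random_seed requires TF 2.7+; omit it to let every run vary freely):

import tensorflow as tf

# Load and normalize Fashion MNIST, as in the lab
(train_images, train_labels), (test_images, test_labels) = \
    tf.keras.datasets.fashion_mnist.load_data()
train_images, test_images = train_images / 255.0, test_images / 255.0

# Retrain the identical architecture several times; only the random
# weight initialization and data shuffling differ between runs
for trial in range(3):
    tf.keras.utils.set_random_seed(trial)  # different seed per run (TF >= 2.7)
    model = tf.keras.models.Sequential([
        tf.keras.layers.Conv2D(32, (3,3), activation='relu', input_shape=(28, 28, 1)),
        tf.keras.layers.MaxPooling2D(2, 2),
        tf.keras.layers.Conv2D(32, (3,3), activation='relu'),
        tf.keras.layers.MaxPooling2D(2, 2),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation='relu'),
        tf.keras.layers.Dense(10, activation='softmax')
    ])
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    model.fit(train_images, train_labels, epochs=5, verbose=0)
    _, test_acc = model.evaluate(test_images, test_labels, verbose=0)
    print(f"Trial {trial}: test accuracy = {test_acc:.4f}")

The spread across trials is typically on the same order as the 89% vs. 90% difference in question. You can also compare model.summary() for the two architectures to check the parameter counts directly.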