How does loss of information lead to better accuracy?

So, I’ve been looking into the code for C1_W3_Lab_1.

# Define the model
model = tf.keras.models.Sequential([
  # Add convolutions and max pooling
  tf.keras.layers.Conv2D(32, (3,3), activation='relu', input_shape=(28, 28, 1)),
  tf.keras.layers.MaxPooling2D(2, 2),
  tf.keras.layers.Conv2D(32, (3,3), activation='relu'),
  tf.keras.layers.MaxPooling2D(2, 2),

  # Add the same layers as before
  tf.keras.layers.Flatten(),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dense(10, activation='softmax')
])

# Print the model summary
model.summary()
# Use same settings
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# Train the model
print(f'\nMODEL TRAINING:')
model.fit(training_images, training_labels, epochs=5)

# Evaluate on the test set
test_loss = model.evaluate(test_images, test_labels)

From what I understand, Conv2D convolves the 28x28 image into 32 smaller 26x26 matrices. This means each matrix will have lost a lot of data. Then we use MaxPooling2D(2, 2), which causes further data loss by converting each 2x2 block to a single 1x1 value, so only 25% of the values are kept. Then we repeat this process, losing even more data.
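For reference, here is a rough sketch of the shape arithmetic for the layers above. It assumes the Keras defaults (no padding, stride 1 for Conv2D, and a pooling stride equal to the pool size); the helper name `trace_shapes` is made up for illustration and is not part of the lab.

```python
def trace_shapes(size=28, filters=32, kernel=3, pool=2):
    """Return (layer name, height/width, channels, total values) per layer.

    Assumes 'valid' convolutions with stride 1 and non-overlapping pooling,
    which are the Keras defaults for the layers used in the model above.
    """
    rows = [("input", size, 1, size * size)]
    for i in (1, 2):
        # A 3x3 kernel fits in (size - 3 + 1) positions along each axis
        size = size - kernel + 1
        rows.append((f"conv{i}", size, filters, size * size * filters))
        # 2x2 pooling halves each axis, dropping any leftover row/column
        size = size // pool
        rows.append((f"pool{i}", size, filters, size * size * filters))
    return rows


for name, side, channels, total in trace_shapes():
    print(f"{name:6s} {side}x{side}x{channels} -> {total} values")
```

Under these assumptions the last pooling layer outputs 5x5x32 = 800 values, slightly more than the 784 pixels of the input image, so the dense layers are not necessarily working with fewer numbers than before.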

This is further shown by this graph:
[image: download (7)]

So intuition says that since fewer pieces of data are available, classification should be less accurate, just like when your vision blurs and you can't correctly identify an object.

But surprisingly, the accuracy here goes up.

Can anyone help me figure out why?

Please post your question in the right topic. I’ve fixed your post this time.
A conv layer produces an output of smaller height and width because each filter slides over the input and summarizes one small block of entries at a time. To put it another way, each value in the output of a convolution is a summary of a block of matrix entries. Please watch the lecture videos in Course 2. If you need more details, I encourage you to take the Deep Learning Specialization, where you'll implement both the forward and backward passes of conv and max pooling layers from scratch.
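To make the "summary of a block" idea concrete, here is a minimal pure-Python sketch. The 8x8 toy image and the vertical-edge filter are assumptions picked by hand for illustration; a trained Conv2D layer learns its own filter weights, and none of these helper functions come from the lab.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1).

    Each output value is a weighted sum of one kernel-sized block.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(fmap):
    """Zero out negative responses, as activation='relu' does."""
    return [[max(0, v) for v in row] for row in fmap]


def max_pool(fmap, size=2):
    """Keep the strongest response in each non-overlapping size x size block."""
    out_h = len(fmap) // size
    out_w = len(fmap[0]) // size
    return [
        [
            max(fmap[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Toy 8x8 image: dark (0) everywhere except a bright (1) strip on the right
image = [[0, 0, 0, 0, 0, 0, 1, 1] for _ in range(8)]

# Hand-picked filter that responds to dark-to-bright vertical edges
kernel = [[-1, 0, 1],
          [-1, 0, 1],
          [-1, 0, 1]]

features = relu(conv2d_valid(image, kernel))  # 6x6 feature map
pooled = max_pool(features)                   # 3x3 after pooling

print(features[0])  # [0, 0, 0, 0, 3, 3]
print(pooled)       # [[0, 0, 3], [0, 0, 3], [0, 0, 3]]
```

In this toy case the 64 pixels shrink to 9 values, yet those 9 values still say "there is a vertical edge on the right side", which is the kind of information the dense layers can use for classification. What gets discarded is mostly the exact pixel positions, not the feature itself.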
