Why use a Dense 512 node layer before the final binary classification node?

The code used in the course uses two Dense layers after the final max pooling and flattening, e.g.:
tf.keras.layers.Dense(512, activation='relu'),
tf.keras.layers.Dense(1, activation='sigmoid')

What is the purpose of the Dense layer with 512 nodes? I tried doubling the number of nodes here (to 1024) and dropping it altogether, and didn’t notice much change in the accuracy of the network predictions. This was in the assignment at the end of Week 1.


Hi, @Brendon_Wolff-Piggot!

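The final dense layers take the features extracted by the convolutional layers and combine them to produce the final output. The hidden Dense layer (512 nodes here) learns non-linear combinations of those features, and the single sigmoid node turns the result into a probability for the binary classification. The exact number of neurons is not strictly prescribed: it is a hyperparameter, so you can try different configurations to further optimize network performance.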
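To make that concrete, here is a minimal pure-Python sketch of what the dense head computes. It is not the course code: the helper names (`dense`, `random_layer`, `classify`, `head_parameter_count`) are hypothetical, and the weights are small random values standing in for trained ones. It only illustrates that the hidden layer is an optional, resizable step between the flattened features and the sigmoid output.

```python
import math
import random


def relu(z):
    return max(0.0, z)


def sigmoid(z):
    # Numerically stable form of 1 / (1 + exp(-z)).
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W x + b), one row of W per node."""
    outputs = []
    for row, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(row, inputs)) + bias
        outputs.append(activation(z))
    return outputs


def random_layer(n_in, n_out, rng):
    """Hypothetical random weights (assumption: stand-ins for trained values)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def classify(features, hidden_units, rng):
    """Run the dense head; hidden_units=None drops the hidden layer entirely."""
    x = features
    if hidden_units is not None:
        w, b = random_layer(len(x), hidden_units, rng)
        x = dense(x, w, b, relu)  # plays the role of Dense(hidden_units, 'relu')
    w, b = random_layer(len(x), 1, rng)
    return dense(x, w, b, sigmoid)[0]  # plays the role of Dense(1, 'sigmoid')


def head_parameter_count(n_features, hidden_units):
    """Weights plus biases in the dense head, with or without the hidden layer."""
    if hidden_units is None:
        return n_features + 1
    return (n_features * hidden_units + hidden_units) + (hidden_units + 1)
```

Calling `classify(features, 512, rng)`, `classify(features, 1024, rng)` and `classify(features, None, rng)` on the same flattened feature vector gives a probability in each case, which mirrors the experiment in the question. `head_parameter_count` shows what changes underneath: the hidden layer accounts for almost all of the head's trainable parameters, so its width mainly trades capacity against training cost. On an easy dataset that extra capacity may not be needed, which would be consistent with seeing little change in accuracy.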
