In TF1 Courses 1 and 2, when building a neural network, we used a Flatten layer followed by a first Dense layer with 512 units (or, more generally, more than 100 units).
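For concreteness, here is a minimal sketch of the kind of model I mean from Courses 1 and 2 (the 28x28 input shape and 10-class output are placeholders in the style of the Fashion MNIST examples, not values from any specific notebook):

```python
import tensorflow as tf

model = tf.keras.Sequential([
    # 28x28 image flattened into a 784-long vector
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    # Large first Dense layer, as used in Courses 1 and 2
    tf.keras.layers.Dense(512, activation='relu'),
    # 10-class classification output
    tf.keras.layers.Dense(10, activation='softmax'),
])
```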
However, in Course 3 (NLP), besides using GlobalAveragePooling1D instead of Flatten, we also shrink the first Dense layer to just 6 units.
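And a minimal sketch of the Course 3 style model (vocab_size and embedding_dim are placeholder hyperparameters I've filled in for illustration):

```python
import tensorflow as tf

# Placeholder hyperparameters, not values from a specific notebook
vocab_size = 10000
embedding_dim = 16

model = tf.keras.Sequential([
    # Each token id becomes a 16-dim vector: (batch, seq_len, 16)
    tf.keras.layers.Embedding(vocab_size, embedding_dim),
    # Averages over the sequence dimension: one 16-dim vector per example
    tf.keras.layers.GlobalAveragePooling1D(),
    # The much smaller first Dense layer my question is about
    tf.keras.layers.Dense(6, activation='relu'),
    # Binary (e.g. sentiment) output
    tf.keras.layers.Dense(1, activation='sigmoid'),
])
```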
Can someone help me understand why this reduction is made?