Why Flatten after GlobalAveragePooling2D in C3 W1 Lab2 Classifier?



The shape doesn’t change, and there don’t seem to be any learnable parameters in either layer …

I can understand the need for GlobalAveragePooling2D after the feature extraction portion but why is it followed by a Flatten?

I can also understand the Dense after the Flatten … just cannot figure out the reason for the Flatten after the GlobalAveragePooling2D …

Warmest Regards
Steven Sim

There’s no need for a Flatten layer when input & output shapes remain the same.

Thank you for replying!

Then perhaps somebody should change the C3 W1 Lab2 Classifier …

Warmest Regards
Steven Sim

The Flatten is needed so you can feed all those values into a Dense layer.

Flatten is not required to feed inputs to the Dense layer. It is meant to convert an n-dimensional input to a flat output. Please see the example in this link. Outputs from GlobalAveragePooling2D can be fed directly to a Dense layer.
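A quick shape check supports this. The sketch below uses NumPy to mimic the shape behavior of the Keras layers (the batch and feature-map dimensions are made up for illustration):

```python
import numpy as np

# Hypothetical conv-backbone output: (batch, height, width, channels)
features = np.ones((8, 7, 7, 512), dtype=np.float32)

# GlobalAveragePooling2D averages over the spatial axes,
# producing one value per channel: shape (batch, channels).
pooled = features.mean(axis=(1, 2))
print(pooled.shape)  # (8, 512)

# The pooled output is already rank 2, so a following Flatten
# changes nothing about the shape:
flattened = pooled.reshape(pooled.shape[0], -1)
print(flattened.shape)  # (8, 512)
```

Since the tensor coming out of GlobalAveragePooling2D is already `(batch, channels)`, the extra Flatten is a no-op shape-wise.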


The point of my original post is that the person who prepared the C3 W1 Lab 2 Classifier definition included BOTH GlobalAveragePooling2D AND Flatten.

I was trying to figure out the reason.

If only one of the two is necessary (either GlobalAveragePooling2D OR Flatten), then perhaps somebody should remove the redundant one in that lab.

As an aside, when should we use Flatten and when should we use GlobalAveragePooling2D, since both result in a flat tensor that can be ingested by a Dense layer?

Warmest Regards
Steven Sim


Please see Flatten and GlobalAveragePooling2D. For each entry in the batch, GlobalAveragePooling2D takes the average of all entries per channel.
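The two layers produce very different outputs from the same feature maps, which also affects the size of the Dense layer that follows. A hedged NumPy sketch (dimensions are made up for illustration):

```python
import numpy as np

# Hypothetical feature maps: (batch, height, width, channels)
x = np.random.rand(4, 7, 7, 64).astype(np.float32)

# Flatten keeps every spatial position: (batch, h * w * c)
flat = x.reshape(x.shape[0], -1)
print(flat.shape)  # (4, 3136)

# GlobalAveragePooling2D collapses each channel's 7x7 map
# to a single average: (batch, channels)
gap = x.mean(axis=(1, 2))
print(gap.shape)  # (4, 64)
```

A `Dense(10)` after Flatten would need 3136 × 10 weights, versus 64 × 10 after GlobalAveragePooling2D, so pooling drastically reduces parameters at the cost of discarding the spatial layout.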

Choosing a layer is a hyperparameter that you must work out for the problem at hand.

I agree with your suggestion of fixing the lab. @gent.spah is the official mentor for advanced tensorflow. He is the one who can file a github ticket for this.

Hope this helps.

Hi guys, I appreciate your thoughts and you might well be right, but my intuition tells me that even though it might be possible to feed GlobalAveragePooling2D directly into a Dense layer, the results will not be the same, and you, Steven, could try it. The reason I say so is that average pooling is a convolutional process, a different spatial arrangement of learned parameters, whilst the flattened vector is a 1D vector of features. @chris.favila, hope you are doing well; could you have a look into this as @balaji.ambresh suggests? Thank you.


Thank you for getting more heads to look at this. Could you please share this post on your mentor forum and maybe raise a ticket?

@Steven_Sim. Please vote on this thread to increase visibility.

Ping +1 for an update. Thanks.

Aug 25
Ping +2 for an update.

Oct 16
Ping +3 for an update

Nov 26
Ping +4 for an update

Hi Balaji,

I already let Chris Favila know about this above, in case he wants to have a look at it. To me it’s fine as it is.