Hi there, I have been reading a debate about the ordering of batch normalization (BN) and dropout in hidden layers. Here is an example using Keras:
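The layer sizes and dropout rate below are just placeholders to show the ordering I mean:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Each hidden block: Dense -> BN -> Activation -> Dropout
model = models.Sequential([
    layers.Input(shape=(20,)),
    layers.Dense(64),                 # first hidden block, directly after the input layer
    layers.BatchNormalization(),      # BN before the activation
    layers.Activation("relu"),
    layers.Dropout(0.3),              # dropout after the activation
    layers.Dense(64),
    layers.BatchNormalization(),
    layers.Activation("relu"),
    layers.Dropout(0.3),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```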
As you can see, I have BN just after the input layer but before the activation function, then dropout after that. Is that the right placement? I have noticed performance differences when I change the order and cannot find much of a consensus online.
Welcome to the forum @nickmuchi!
The question regarding the order of operations of batch normalization and dropout is a good one. As you point out, a search online turns up differing opinions and approaches, each reporting different results.
My advice is to experiment with your particular model and data. Try changing the order of the BN and dropout layers, and even removing one of them. Compute a performance metric (or several) over something on the order of 10-100 realizations of each configuration, to account for random weight initialization. Note also the effect each configuration has on overfitting (the main concern dropout addresses) and on training speed and stability (the main benefits of BN). A rough sketch of such an experiment loop is below.
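Everything in this sketch is a placeholder (the toy data, the layer sizes, the list of orderings, the run count); swap in your own model and dataset:

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

# Toy placeholder data -- replace with your own dataset.
x = np.random.rand(1000, 20).astype("float32")
y = np.random.randint(0, 2, size=(1000, 1)).astype("float32")

def build_model(order):
    # 'order' is a flag invented for this sketch, not a Keras option.
    block = {
        "bn_then_dropout": [layers.BatchNormalization(),
                            layers.Activation("relu"),
                            layers.Dropout(0.3)],
        "dropout_then_bn": [layers.Activation("relu"),
                            layers.Dropout(0.3),
                            layers.BatchNormalization()],
        "bn_only":         [layers.BatchNormalization(),
                            layers.Activation("relu")],
        "dropout_only":    [layers.Activation("relu"),
                            layers.Dropout(0.3)],
    }[order]
    return models.Sequential([layers.Input(shape=(20,)), layers.Dense(64)]
                             + block
                             + [layers.Dense(1, activation="sigmoid")])

for order in ["bn_then_dropout", "dropout_then_bn", "bn_only", "dropout_only"]:
    scores = []
    for seed in range(10):  # use 10-100 runs to average over weight initializations
        tf.keras.utils.set_random_seed(seed)  # requires TF >= 2.7
        model = build_model(order)
        model.compile(optimizer="adam", loss="binary_crossentropy",
                      metrics=["accuracy"])
        hist = model.fit(x, y, epochs=5, validation_split=0.2, verbose=0)
        scores.append(hist.history["val_accuracy"][-1])
    print(f"{order}: {np.mean(scores):.4f} +/- {np.std(scores):.4f}")
```

Comparing the mean and spread of the validation metric across seeds, rather than a single run, is what lets you tell a real ordering effect apart from initialization noise.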
You may or may not see significant performance differences, but if nothing else, you’ll learn a lot from this exercise! Depending on your available time and motivation, you could put together a thorough analysis and strong contribution to the online deep learning community.
Thanks for the response, will give your suggestions a shot, much appreciated!