About Dropout in CNN

There are a few Dropout layers applied in U-Net, and I'm still confused about how dropout works in a CNN. In an FC layer it randomly mutes some nodes' outputs, but how does that concept apply in a CNN after a Conv2D layer?

Dropout works the same way after a Conv layer: it randomly zeros some of the activation outputs to weaken the connections. It's just that the input and output are 3D tensors instead of vectors, but the effect is the same. You can think of each position in the conv output as a "neuron", in just the same way as with a Fully Connected layer.
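Here's a minimal sketch of what that looks like in practice (using tf.keras; the tensor shape and dropout rate below are just placeholders for illustration):

```python
import tensorflow as tf

# A toy batch of conv activations: (batch, height, width, channels).
# All values are > 0 so the zeroed positions are easy to spot.
x = tf.random.uniform((1, 4, 4, 3)) + 1.0

dropout = tf.keras.layers.Dropout(rate=0.5)

# training=True forces the mask to be applied; at inference time Dropout is a no-op.
y = dropout(x, training=True)

# Roughly half of the individual activations are zeroed and the rest are scaled by
# 1/(1 - rate), exactly as with a Dense layer's output -- only the shape differs.
print(int(tf.math.count_nonzero(y)), "non-zero values out of", int(tf.size(y)))
```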

Adding to @paulinpaloalto's clear answer, I'd like to share how I apply dropout in a CNN.

A typical ‘module’ of a CNN that I use has the following components:

  1. The Convolutional layer itself (like a Conv2D layer)
  2. Followed by an activation (usually ReLU)
  3. Followed by a Pooling (usually MaxPooling).

This ‘module’ structure is repeated 1 or more times before getting to the head of the model.

Then the ‘head’ of the model usually has a Flatten layer, followed by one or more Dense layers, with the final layer chosen according to the task of the model.

Given the above general structure, I would insert Dropout layers in zero or more of the following:

  1. In between ‘modules’. For example: ‘module_1’ - Dropout - ‘module_2’ - Dropout…
    I would experiment carefully before adding a Dropout right before the Flatten layer.

  2. Between the Dense layers.
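For concreteness, here is a sketch of that structure in Keras (the input shape, filter counts, dropout rates, and the 10-class output are placeholder choices, not recommendations):

```python
import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(64, 64, 3)),

    # 'module_1': Conv -> ReLU -> MaxPooling
    layers.Conv2D(32, 3, padding="same", activation="relu"),
    layers.MaxPooling2D(),
    layers.Dropout(0.25),            # Dropout between modules

    # 'module_2'
    layers.Conv2D(64, 3, padding="same", activation="relu"),
    layers.MaxPooling2D(),
    layers.Dropout(0.25),

    # head
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.5),             # Dropout between Dense layers
    layers.Dense(10, activation="softmax"),
])

model.summary()
```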

There’s no fixed formula for when, where, or how to add Dropout. It depends very much on your task, so experimenting with several combinations is an essential part of your implementation.

I hope this adds a bit more light to your question :slight_smile:

Juan

Thanks, Juan, for all the additional great explanations! The only thing I would add is that, looking at the various models Prof Ng shows us here in Course 4, it’s common to see several conv layers back to back before a pooling layer. In other words, there is more than one way to reduce the height and width dimensions: you can do it with a conv layer using “valid” instead of “same” padding, or with a stride greater than 1. So it is not necessary to have a pooling layer after each conv layer. All these network architecture questions are decisions that need to be made by the designer in each case. If you are lucky, you can find an existing model that worked well on a problem at least somewhat similar to the new problem you are trying to address. Start with that architecture and make further adjustments to tailor it to your problem.
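As a quick illustration of that point (the input shape and filter counts below are placeholders), here are three ways the spatial dimensions can shrink, only the first of which is a pooling layer:

```python
import tensorflow as tf
from tensorflow.keras import layers

x = tf.zeros((1, 64, 64, 3))

print(layers.MaxPooling2D(pool_size=2)(x).shape)                # (1, 32, 32, 3)  pooling
print(layers.Conv2D(8, 3, padding="valid")(x).shape)            # (1, 62, 62, 8)  "valid" padding
print(layers.Conv2D(8, 3, strides=2, padding="same")(x).shape)  # (1, 32, 32, 8)  stride > 1
```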

Thanks @paulinpaloalto ! It is totally as you say! I didn’t mean to suggest a single structure of CNN modules, but now that I re-read it, it does look like that, so my apologies :slight_smile:

For some reason, every time I create a CNN I build it out of these ‘modules’, but as you explain, that is a choice I made. Definitely, the structure of the CNN can be anything the designer needs/decides.

Thank you for clarifying this!

Juan

Thanks for the clarification. If I understand it correctly, for each input value (an activation of the previous layer) there is a chance it gets set to 0, and it stays 0 in the conv computation for all of the next layer’s filters/channels (rather than dropout being applied to the input separately for each channel’s filter). Let me know if I’ve made a mistake.

Yes, the dropout is applied to the output of a given layer. Of course in any neural network (fully connected, convnet, …), the output of one layer is the input to the next layer.
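A small sketch of that point (the shapes and dropout rate are placeholders): the mask is applied once to the layer’s output, and every filter of the next Conv2D reads that same masked tensor:

```python
import tensorflow as tf
from tensorflow.keras import layers

a = tf.random.uniform((1, 8, 8, 16)) + 1.0          # activations of the previous layer
a_dropped = layers.Dropout(0.5)(a, training=True)   # positions zeroed once, before the next layer

next_conv = layers.Conv2D(32, 3, padding="same")    # all 32 filters convolve the same masked input
z = next_conv(a_dropped)
print(a_dropped.shape, "->", z.shape)
```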
