Why do we need to scale A after dropout

In week 1, Prof. Ng mentioned that we need to scale the A matrix after dropout to keep E[A] constant. I am trying to understand why we need to keep E[A] constant in the first place.

Hi, @aesmat.

The problem is that neurons are only dropped during training, not at inference. Each activation is kept with probability keep_prob, so dropping alone shrinks E[A] by a factor of keep_prob; dividing by keep_prob restores the original expectation. If you didn't scale, the activations your network sees at inference (where nothing is dropped or scaled) would be systematically larger than those it saw during training, so the later layers' weights would be calibrated for the wrong scale. I think it's clear why that would be a problem, but let me know if it isn't.
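Here is a minimal NumPy sketch of the idea (the shapes and names like `keep_prob` and the mask `D` are just for illustration, following the inverted-dropout convention used in the course):

```python
import numpy as np

np.random.seed(0)

keep_prob = 0.8                     # probability of keeping each neuron
A = np.random.rand(4, 1000)         # hypothetical activations of some layer

# Inverted dropout: zero out neurons, then divide by keep_prob
D = np.random.rand(*A.shape) < keep_prob   # dropout mask
A_dropped = A * D                          # ~20% of activations zeroed
A_scaled = A_dropped / keep_prob           # rescale so E[A_scaled] ~ E[A]

print(A.mean())          # original mean activation
print(A_dropped.mean())  # roughly keep_prob * A.mean()
print(A_scaled.mean())   # back close to A.mean()
```

Without the division, the third mean would stay around keep_prob times the first, which is the mismatch the network would face at inference time.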

Good luck with course 2 :slight_smile:
