Regularization by Inverted Dropout

After watching the videos, dropout makes a lot of sense to me, and I get the basic intuition for why it works.

What I did not really understand is the division of the activations a by the “keep probability”. Andrew went a bit too fast for me here. Can anybody explain that a bit more?

Hi @ralfphonso,

There have been some interesting discussions about this; if you search, I’m sure you will find them. Here are a couple.
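
As a short version of the idea: dropout zeroes each activation with probability 1 - keep_prob, so the expected value of the layer's output shrinks by a factor of keep_prob. Dividing by keep_prob scales it back up, so the next layer sees activations on roughly the same scale during training as at test time, when no units are dropped. Here is a minimal NumPy sketch of that, not the course's exact code; the function name `inverted_dropout` and the toy all-ones activations are my own assumptions for illustration.

```python
import numpy as np


def inverted_dropout(a, keep_prob, rng):
    """Apply inverted dropout to activations `a` (hypothetical helper).

    Each unit is kept with probability `keep_prob`; the survivors are
    divided by `keep_prob` so the expected value of the output matches
    the expected value of the input.
    """
    # Boolean mask: True where the unit is kept.
    mask = rng.random(a.shape) < keep_prob
    # Zero out the dropped units.
    a_dropped = a * mask
    # Rescale the kept units to compensate for the dropped ones.
    return a_dropped / keep_prob


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    keep_prob = 0.8
    a = np.ones((1000, 1000))  # toy activations, all equal to 1

    mask = rng.random(a.shape) < keep_prob
    # Without the division the mean drops to roughly keep_prob.
    print("mean without scaling:", (a * mask).mean())
    # With the division the mean stays roughly at the original value.
    print("mean with scaling:   ", inverted_dropout(a, keep_prob, rng).mean())
```

Because the scaling is already done during training, nothing has to be changed at test time: you just run the network without the mask and without any extra factor.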