Dropout regularization

In inverted dropout, we want some units to be zeroed out so that the complexity of the neural network decreases. After you multiply a3 by d3 (a random boolean matrix obtained by checking whether uniformly random values are less than keep_prob), you get a matrix a3 with some elements randomly zeroed out, and each zeroed position indicates that the corresponding hidden unit has been eliminated.

But I don't understand the reason behind the scaling step (a3 /= 0.8): it affects every remaining value in the matrix, and it seems to me that should not be the case. A sketch of the steps as I understand them is below.
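
For concreteness, here is a minimal sketch of the forward step as I understand it (toy shapes and a fixed seed, not the assignment code):

```python
import numpy as np

np.random.seed(1)
keep_prob = 0.8

# Hypothetical activations for layer 3: 4 hidden units, 5 examples
a3 = np.random.randn(4, 5)

# d3: boolean mask; each entry is True with probability keep_prob
d3 = np.random.rand(a3.shape[0], a3.shape[1]) < keep_prob

a3 = a3 * d3         # zero out the dropped units
a3 = a3 / keep_prob  # the scaling step I am asking about
```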

Please correct me if I am wrong!

Hi, @ajaykumar3456.

You’ll actually have to implement this in week one’s second assignment! I’m pretty sure you won’t have any problems with it, and I think you should probably remove the code :sweat_smile:

I tried to explain the reason behind scaling here. Let me know if that helped!
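
For anyone reading later who can't follow the link, a quick numerical sanity check of the usual argument: each unit survives with probability keep_prob, so without scaling the expected activation shrinks by a factor of keep_prob, and dividing by keep_prob restores it (a toy array with hypothetical values, not the assignment code):

```python
import numpy as np

np.random.seed(0)
keep_prob = 0.8
a = np.ones((1, 1_000_000))  # toy activations, all equal to 1

d = np.random.rand(*a.shape) < keep_prob
dropped = a * d               # mean is ~keep_prob: activations shrank
scaled = dropped / keep_prob  # mean is ~1 again: expectation restored

print(dropped.mean(), scaled.mean())  # ~0.8, ~1.0
```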
