No need to apologize!
The symmetry problem is a separate issue. Dividing by keep_prob
compensates for the drop in the expected value of the activations that dropout causes:
(source)
From what I understood, this is the formal justification, but an example may be clearer:
>>> import numpy as np
>>> m = 1000
>>> keep_prob = 0.5
>>> r = np.random.binomial(1, keep_prob, m)
>>> y = np.random.normal(1, 0.1, m)
>>> np.mean(y)
1.0004794606768865
>>> np.mean(r * y)
0.5123941559332359
>>> np.mean(r * y / keep_prob)
1.0247883118664718
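The reason this works: the mask r is independent of y and has mean keep_prob, so E[r * y / keep_prob] = E[r] * E[y] / keep_prob = E[y]. Here is a minimal sketch of the same idea wrapped in a function, in case it helps. The name `inverted_dropout`, the fixed seed, and the sample size are my own choices for illustration, not taken from any library:

```python
import numpy as np

def inverted_dropout(a, keep_prob, rng):
    """Zero each entry of `a` with probability 1 - keep_prob, then rescale."""
    # Hypothetical helper for illustration only.
    mask = rng.random(a.shape) < keep_prob   # True where the unit is kept
    return a * mask / keep_prob              # rescale so the expected value matches a

rng = np.random.default_rng(0)               # fixed seed, only for reproducibility
a = rng.normal(1.0, 0.1, size=100_000)       # stand-in for a layer's activations

for keep_prob in (0.5, 0.8, 1.0):
    dropped = inverted_dropout(a, keep_prob, rng)
    # The two means should be close for every keep_prob.
    print(keep_prob, a.mean(), dropped.mean())
```

With keep_prob = 1.0 nothing is dropped and the output equals the input exactly; for smaller values the means agree only approximately, since the mask is random.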
Let me know if something doesn’t make sense.