Is the formula for scaled random weight initialization the same with dropout?

When learning about variance-scaled weight initialization to prevent vanishing/exploding gradients, we see that W[l] = np.random.randn(n[l], n[l-1]) * np.sqrt(1/n[l-1]), where n[l] is the number of nodes in layer l.
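For concreteness, here is a minimal sketch of that formula without dropout. The layer sizes, the helper name `init_weights`, and the unit-variance inputs are assumptions chosen for illustration only.

```python
import numpy as np

def init_weights(n_out, n_in, rng):
    # Hypothetical helper: W[l] has shape (n[l], n[l-1]),
    # with entries drawn from N(0, 1/n[l-1]).
    return rng.standard_normal((n_out, n_in)) * np.sqrt(1.0 / n_in)

rng = np.random.default_rng(0)
n_prev, n_curr, m = 512, 256, 2000   # assumed sizes: n[l-1], n[l], batch size

W = init_weights(n_curr, n_prev, rng)
a_prev = rng.standard_normal((n_prev, m))   # assumed zero-mean, unit-variance inputs
z = W @ a_prev                              # pre-activations of layer l

# With this scaling, the variance of z stays close to the input variance.
print("var(a_prev):", a_prev.var())
print("var(z):     ", z.var())
```

The 1/n[l-1] factor is what keeps var(z) near var(a_prev); without it, the variance would grow roughly in proportion to n[l-1].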

Does this initialization still work with dropout regularization, seeing as the number of nodes in layer l-1 that are actually connected to each neuron in layer l (the n[l-1] in the formula) changes on every training step?

It does seem like the noise introduced by dropout could affect how variance propagates from layer to layer, and there are initialization strategies that take this into account, but I can't recommend a specific one.
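To make the variance effect concrete, here is a small numerical sketch. It assumes inverted dropout (kept activations are divided by keep_prob) and zero-mean, unit-variance inputs; the value of keep_prob, the layer sizes, and the keep_prob-scaled variant `W_adj` are illustrative assumptions, not a recommended scheme.

```python
import numpy as np

rng = np.random.default_rng(0)
keep_prob = 0.8                      # assumed dropout keep probability
n_prev, n_curr, m = 512, 256, 2000   # assumed sizes: n[l-1], n[l], batch size

a_prev = rng.standard_normal((n_prev, m))   # assumed zero-mean, unit-variance inputs

# Inverted dropout: zero out units at random, then rescale by 1/keep_prob
# so the expected value of each activation is unchanged.
mask = rng.random((n_prev, m)) < keep_prob
a_drop = a_prev * mask / keep_prob

# Initialization from the question: variance 1/n[l-1].
W_plain = rng.standard_normal((n_curr, n_prev)) * np.sqrt(1.0 / n_prev)
# One possible adjustment (an assumption): shrink the variance by keep_prob.
W_adj = rng.standard_normal((n_curr, n_prev)) * np.sqrt(keep_prob / n_prev)

var_plain = (W_plain @ a_drop).var()   # expected near 1/keep_prob
var_adj = (W_adj @ a_drop).var()       # expected near 1

print("var(z), plain init:   ", var_plain)
print("var(z), adjusted init:", var_adj)
```

Under these assumptions the rescaling keeps the mean of the activations fixed but inflates their second moment by 1/keep_prob, so the plain formula lets the variance grow by that factor per layer during training. The adjusted variant cancels this during training, at the cost of a variance of about keep_prob per layer once dropout is switched off at test time, so it is a trade-off rather than a clear fix.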

