Regularization question

In week 1, Regularization video, regularization term is divided by number of samples m.

J = \dfrac{1}{m} \sum_{i=1}^{m} \mathcal{L}(\hat{y}^{(i)}, y^{(i)}) + \dfrac{\lambda}{2m}\|W\|^2_F.

Why is this done?
If I duplicate every sample, the data term of the cost stays the same (the sum of losses doubles, but so does m), yet the regularization term \dfrac{\lambda}{2m}\|W\|^2_F becomes 2 times smaller, since \|W\|^2_F is unchanged while m doubles…
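The effect described above is easy to check numerically. Here is a minimal sketch (the data, weights, and λ value are made up for illustration) that computes the data term and the penalty term of the cost separately, then duplicates every sample:

```python
import numpy as np

# Toy setup: squared-error loss plus an L2 penalty,
# J = (1/m) * sum(loss) + (lambda/(2m)) * ||W||_F^2.
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))   # m = 4 samples, 3 features (arbitrary)
y = rng.normal(size=(4, 1))
W = rng.normal(size=(3, 1))   # fixed weights, not trained
lam = 0.1                     # arbitrary regularization strength

def cost_terms(X, y, W, lam):
    m = X.shape[0]
    data_term = np.sum((X @ W - y) ** 2) / m
    reg_term = lam / (2 * m) * np.sum(W ** 2)
    return data_term, reg_term

data1, reg1 = cost_terms(X, y, W, lam)
# Duplicate every sample: the loss sum doubles but so does m,
# so the data term is unchanged; ||W||_F^2 is unchanged while m
# doubles, so the penalty term is halved.
data2, reg2 = cost_terms(np.vstack([X, X]), np.vstack([y, y]), W, lam)

print(np.isclose(data1, data2))      # True: data term unchanged
print(np.isclose(reg2, reg1 / 2))    # True: penalty halved
```

So with the 1/m factor, adding exact duplicates weakens the effective regularization, which is exactly what the question observes.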

Dividing by m reduces the amount of regularization for large data sets. That is roughly the intended behavior: with more training data there is less risk of overfitting, so less regularization is needed.
This is a common practice that seems to work well in practice - there is no rigorous mathematical derivation behind it.