How will two units in a dense layer reach different weights and biases?

I have watched the backpropagation videos, but I am still a little confused about how neural networks learn. In the previous week’s videos, I remember that when dense layers were introduced in the “best seller product” example, Andrew mentioned that we feed all units of the same layer the same input. What I find confusing is: if two units are in the same layer, receive the same input, and have their weights changed by the same algorithm, how will they reach different weights from one another? And if they do not, what is the point of having 2 units doing the same thing?

The key factor is that every unit starts with different small random initial weights. This is called “symmetry breaking”.

This is enough to guide each unit toward different final weights: because the units start from different weights, they produce different activations, so backpropagation computes a different gradient for each one, and the units drift further apart as training proceeds. If all units in a layer were initialized identically, they would receive identical gradient updates and remain identical copies of each other forever.
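
To see this concretely, here is a minimal NumPy sketch (the tiny 2-unit network, the XOR-style toy data, and the `train` helper are my own illustrative choices, not from the course). It trains the same network twice: once with the hidden weights initialized to zero, and once with small random values. With zero initialization the two hidden units stay exact copies of each other; with random initialization they diverge.

```python
import numpy as np

def train(W1, steps=200, lr=0.5):
    """Train a tiny 1-hidden-layer net (2 sigmoid hidden units) on toy
    XOR-style data, starting from the given hidden-layer weights W1.
    Returns W1 after training."""
    X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
    y = np.array([[0.], [1.], [1.], [0.]])
    W2 = np.full((2, 1), 0.5)          # output weights, same for both runs
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    for _ in range(steps):
        h = sigmoid(X @ W1)            # hidden activations (one column per unit)
        out = sigmoid(h @ W2)          # network output
        # Backpropagation for a mean-squared-error loss:
        d_out = (out - y) * out * (1 - out)
        d_h = (d_out @ W2.T) * h * (1 - h)
        W2 -= lr * (h.T @ d_out)
        W1 -= lr * (X.T @ d_h)
    return W1

# Symmetric (zero) init: both hidden units see identical gradients every step.
W_sym = train(np.zeros((2, 2)))

# Small random init breaks the symmetry, so the units can specialize.
rng = np.random.default_rng(42)
W_rand = train(rng.normal(scale=0.1, size=(2, 2)))

# Each column of W1 holds one hidden unit's incoming weights.
print(np.allclose(W_sym[:, 0], W_sym[:, 1]))    # True:  the units never separated
print(np.allclose(W_rand[:, 0], W_rand[:, 1]))  # False: the units diverged
```

The symmetric run shows exactly the failure mode you were worried about: both units compute the same thing, so the second unit adds nothing. Random initialization is what makes the extra unit worthwhile.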