Why do we normalize the output of the residual block?

Hi,

Why are we trying to make the output here follow a standard normal distribution?

Thanks

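I think it is because it makes training easier. Normalizing the output of each residual block keeps the activations at roughly zero mean and unit variance, so their scale does not grow as more blocks are stacked, and the gradients stay in a range the optimizer can handle.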
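
To illustrate the point about training, here is a rough NumPy sketch (not the code from the course or paper being asked about). It stacks a made-up residual block 20 times, once with and once without normalization, and prints the spread of the activations. The names `layer_norm` and `residual_block`, the ReLU layer, and the sizes are all assumptions chosen for the demo, and the normalization has no learned scale or shift.

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # Normalize each row to zero mean and unit variance (no learned scale/shift).
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def residual_block(x, w, normalize):
    # Hypothetical block: a linear layer with ReLU, added back to its input.
    out = x + np.maximum(x @ w, 0.0)
    return layer_norm(out) if normalize else out

rng = np.random.default_rng(0)
dim, depth = 64, 20
x = rng.standard_normal((8, dim))
# Random weights scaled by 1/sqrt(dim), an assumption to keep each layer well-behaved.
weights = [rng.standard_normal((dim, dim)) / np.sqrt(dim) for _ in range(depth)]

plain, normed = x, x
for w in weights:
    plain = residual_block(plain, w, normalize=False)
    normed = residual_block(normed, w, normalize=True)

print("std without normalization:", plain.std())
print("std with normalization:   ", normed.std())
```

Without normalization, each block adds to the scale of its input, so the standard deviation keeps growing with depth. With normalization it stays close to 1 however many blocks are stacked, which is the sense in which training gets easier.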