Why are skip connections necessary?

Are skip connections necessary in big networks because regularization drives down the values of W, which prevents the layers from learning the identity function? If regularization weren’t applied, skip connections wouldn’t be needed, right?

Here is a recent post on this subject:

And I am sure there are plenty more that discuss this issue if you search for it.

Hey, thanks for the reply. My question is about a more specific case, though, and I didn’t find anything that discusses it.

  1. Adding a regularization term to the cost function to prevent overfitting drives down the values of W. This means W[l] can only take values like epsilon*w[l]. So even if a layer tries to learn the identity, it won’t make much difference, because W[l] is kept very small.
  2. So if there were no regularization term in the cost function, would skip connections still be useful?
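
To make the two points above concrete, here is a minimal NumPy sketch of what happens when the weights are scaled down by a small epsilon. Everything in it is an assumption made for illustration: the two-layer block, the ReLU activation, the omitted biases, the layer width `n`, and the names `plain_block` and `residual_block` are hypothetical and not taken from any particular course or library.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def plain_block(a, W1, W2):
    # Two stacked layers with no shortcut (biases omitted for simplicity).
    return relu(W2 @ relu(W1 @ a))

def residual_block(a, W1, W2):
    # Same two layers, but the input a is added back before the last ReLU.
    return relu(W2 @ relu(W1 @ a) + a)

rng = np.random.default_rng(0)
n = 4                                   # hypothetical layer width
a = np.abs(rng.normal(size=n))          # positive, as if produced by an earlier ReLU
epsilon = 1e-3                          # stands in for the shrinkage caused by regularization
W1 = epsilon * rng.normal(size=(n, n))
W2 = epsilon * rng.normal(size=(n, n))

# Distance between each block's output and its input a.
plain_gap = np.linalg.norm(plain_block(a, W1, W2) - a)
res_gap = np.linalg.norm(residual_block(a, W1, W2) - a)

print("plain block,    ||out - a|| =", plain_gap)
print("residual block, ||out - a|| =", res_gap)
```

Under these assumptions, the plain block's output collapses toward zero, so it ends up far from its input, while the residual block passes `a` through almost unchanged. That matches point 1: with tiny weights, only the shortcut keeps the block close to the identity. The sketch does not by itself settle question 2, since it only looks at the small-weight case.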