I have watched the video "Why ResNets Work?" a few times, and I still can’t understand the point of what we are doing.
So, in addition to the usual computation, we pass the output of a layer on to one of the later layers. And? What is that for at all? OK, I heard that it can be useful when weight decay is used, because the intermediate layers can only be skipped when w is close to zero.
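To check my understanding, here is a minimal NumPy sketch of what I think the video describes. The function name `residual_block`, the parameter names `W1`, `b1`, `W2`, `b2`, and the layer size are my own assumptions for illustration, not something taken from the video. I set the second layer's weights and bias exactly to zero, as the extreme case of what weight decay pushes toward.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(a_prev, W1, b1, W2, b2):
    """Two-layer block with a skip connection: a_out = relu(z2 + a_prev)."""
    a1 = relu(W1 @ a_prev + b1)   # first layer of the block
    z2 = W2 @ a1 + b2             # second layer, before the activation
    return relu(z2 + a_prev)      # skip connection: add the block's input back in

rng = np.random.default_rng(0)
n = 4                                 # hypothetical layer width
a_prev = relu(rng.normal(size=n))     # non-negative, as if it came out of an earlier ReLU
W1 = rng.normal(size=(n, n))
b1 = rng.normal(size=n)

# Extreme case of weight decay: the second layer's parameters are all zero.
W2 = np.zeros((n, n))
b2 = np.zeros(n)

# With the skip connection, the block reduces to the identity on a_prev.
a_out = residual_block(a_prev, W1, b1, W2, b2)

# The same two layers without the skip connection output only zeros.
plain_out = relu(W2 @ relu(W1 @ a_prev + b1) + b2)

print(np.allclose(a_out, a_prev))   # True
print(np.allclose(plain_out, 0.0))  # True
```

So with near-zero weights the block with the skip connection just copies its input forward, while the plain stack of layers loses the signal completely. At least in this sketch, the "skipping" happens only because the weights are near zero.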
So, do we need all of this only for regularization? Or is there something more to it?