Hello,
it is said, also in this course, that skip connections enable the network to learn the identity function (which is linear) more easily. On the other hand, on the slides each skip connection, after skipping one, two or three blocks, is fed into the following nonlinearity (ReLU etc.) together with the following activation z. This places a nonlinearity on every "skip" path through the network as well.

The ability to learn the identity function (or a linear function) is therefore restricted to a very short partial path through the network (very local). At least as I see it, this reduces the ability of the network to learn the (linear) identity function, because every (global) "skip" signal path is also distorted by a nonlinearity / by many nonlinearities, which in turn makes it harder for the gradient to backpropagate along these paths as well. Why is there no global path from input to output that contains no nonlinearity at all, if the goal is for the network to learn the identity function (or a linear function) more easily?
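To make concrete what I mean, here is a minimal sketch of such a block as I understand it from the slides (assuming the common post-activation residual block; the class name ResidualBlock and the 3x3 conv layers are just illustrative, not the exact architecture from the course):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualBlock(nn.Module):
    """Post-activation residual block: the skip path is added to the block
    output and the sum is then passed through ReLU, so even the identity/skip
    signal goes through a nonlinearity in every block."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, x):
        z = self.conv2(F.relu(self.conv1(x)))
        return F.relu(z + x)  # the final ReLU also distorts the skip path x

# e.g.: y = ResidualBlock(64)(torch.randn(1, 64, 8, 8))
```

So across many stacked blocks there is no path from input to output that avoids every ReLU, which is exactly what my question is about.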
Thank you very much in advance – Uwe