ResNet identity mapping

Hi @Noam_Mizrachi

I believe I understand your confusion. Take a look at this image of a residual block:
[image: residual block diagram showing F(x), the 'identity' skip connection, and the summation]

When we say that “the identity function is easy for a residual block to learn”, we mean the following. Because of the skip connection, the block sums F(x) and x. If F(x) in the image approaches 0 (due to, for example, L2 regularization shrinking the weights), the summation inside the block does not go to 0. It goes to x, the original tensor that entered the residual block.
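Here is a minimal sketch of that behaviour in plain Python. Everything in it is my own assumption for illustration, not the course's code: the names `residual_block`, `matvec` and `scaled`, the two-layer form of F(x), the toy weights, and the fact that I leave out the final activation so that only the summation is visible. Shrinking the weights stands in for what L2 regularization would do during training.

```python
def relu(v):
    """Element-wise ReLU on a list of floats."""
    return [max(0.0, a) for a in v]

def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * a for w, a in zip(row, v)) for row in W]

def residual_block(x, W1, W2):
    """Hypothetical residual block: returns F(x) + x."""
    f_x = matvec(W2, relu(matvec(W1, x)))    # F(x), the residual branch
    return [f + a for f, a in zip(f_x, x)]   # skip connection adds x unchanged

def scaled(W, s):
    """Return a copy of W with every weight multiplied by s."""
    return [[s * w for w in row] for row in W]

# Toy input and weights (made up for this example).
x = [1.0, -2.0, 0.5]
W1 = [[0.2, -0.1, 0.4], [0.7, 0.3, -0.5], [-0.6, 0.1, 0.9]]
W2 = [[0.5, 0.2, -0.3], [-0.4, 0.8, 0.1], [0.3, -0.7, 0.6]]

gaps = []
for s in (1.0, 0.1, 0.01, 0.0):
    out = residual_block(x, scaled(W1, s), scaled(W2, s))
    gap = max(abs(o - a) for o, a in zip(out, x))  # distance from the identity
    gaps.append(gap)
    print(f"weight scale {s}: max |output - x| = {gap:.6f}")
```

As the weights shrink, the gap between the block's output and x shrinks too, and with all weights at zero the block returns x itself. A plain (non-residual) layer with the same weights would return 0 in that case.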

To be clear, ‘the identity’ or the ‘identity function’ is “a transformation that leaves an object unchanged”. So, in the image, the arrow labeled ‘identity’ means we carry the x tensor forward to the summation as it is, without altering it, because the identity function does nothing to it.

Saying ‘apply the identity transformation/function to the tensor’ instead of ‘do nothing to the tensor’ is terminology brought to deep learning from matrix operations in mathematics, where the identity transformation is the one that, applied to a matrix X, gives back X as the result.
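If it helps to see the matrix version concretely, here is a tiny sketch in plain Python. The helper names `identity` and `matmul` and the example matrix are just mine for illustration: multiplying X by the identity matrix I returns X unchanged.

```python
def identity(n):
    """Build the n x n identity matrix as a list of rows."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*B))  # columns of B
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

X = [[2.0, -1.0],
     [0.5, 3.0]]   # arbitrary example matrix
I = identity(2)

result = matmul(I, X)
print(result == X)  # the identity transformation leaves X as it was
```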

For a more in-depth explanation, rewatch Andrew Ng’s explanation of this on Coursera; it is very good.

Otherwise, read this on the topic, which may also help:
