Hey @karra1729,

You might find this thread helpful, as it addresses a query similar to yours. In it, Paul Sir has described how back-propagation works in ResNets.

However, if you are already clear on how back-propagation works in ResNets, but are unsure how the residual connections help the gradients back-propagate more efficiently, then here's my two cents.

You must already be familiar with the fact that deep neural networks tend to suffer from **vanishing gradients**, because during back-propagation the gradient is multiplied, layer after layer, by factors (weights and activation derivatives) that are often smaller than 1. Also, you must be familiar with the fact that an identity connection (*used by a residual block*) has no weights, or you can say its weight is effectively 1. Concretely, if a residual block computes y = x + F(x), then ∂y/∂x = 1 + ∂F/∂x, so the skip path always contributes a factor of 1. When a gradient back-propagates via an identity connection, it is simply multiplied by 1, so it does not shrink in value, and can therefore propagate over longer distances, or in other words, **more efficiently**.
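If you want to see this numerically, here's a minimal sketch (assuming PyTorch; the `plain_block` / `residual_block` functions, the weight of 0.1 and the depth of 20 are purely illustrative, not from the original thread) that compares the gradient reaching the input through a deep stack of blocks with and without a skip connection:

```python
import torch

# Illustrative comparison: a deep stack of small-weight blocks,
# with and without an identity (skip) connection.

def plain_block(x, w):
    # Gradient through this block is w * tanh'(w * x), well below 1 here
    return torch.tanh(w * x)

def residual_block(x, w):
    # Gradient through this block is 1 + w * tanh'(w * x);
    # the "+ 1" comes from the identity (skip) path
    return x + torch.tanh(w * x)

for block in (plain_block, residual_block):
    x = torch.tensor(1.0, requires_grad=True)
    w = torch.tensor(0.1)      # deliberately small weight
    h = x
    for _ in range(20):        # 20 stacked blocks
        h = block(h, w)
    h.backward()
    print(f"{block.__name__}: gradient at input = {x.grad.item():.3e}")
```

With the plain stack, the gradient reaching the input is vanishingly small (each block scales it by roughly 0.1), whereas with the skip connections it stays healthy, because every block contributes a factor of 1 + ∂F/∂x instead of just ∂F/∂x.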

A query similar to yours can be found here as well. You will also find an article mentioned in that thread which provides an **Intuitive explanation of Skip Connections in Deep Learning**.

Let me know if this helps.

Cheers,

Elemento