Hi! Could you please explain mathematically how backprop is done in ResNet? To be honest, from the course content I didn’t understand how backprop skips layers or how the gradients are calculated mathematically. I’d appreciate your help!
It’s an interesting question, but it is not covered in the course material. Note that by this point in the course, we are doing everything using TensorFlow and Keras. One of the advantages of TF is that it handles training and backpropagation for us automatically: we only have to specify the forward propagation flow and the optimization method, and TF takes care of the rest “under the covers”.

In any network architecture with “skip” connections, the computational graph is more complicated: some nodes now have two outputs and others have two inputs. Each forward connection between two nodes is a function with a known derivative. TF does not estimate these derivatives with finite differences; it uses automatic differentiation, recording the operations as forward propagation runs and then applying each operation’s analytic derivative via the chain rule during back prop. So the node that sprouts both the skip connection and the “straight through” connection has two outputs during forward prop and therefore receives two incoming gradients during back prop. By the multivariable chain rule, those two gradients are added (not averaged) to give the gradient at the originating node.
I am just saying the above on general principles. I have not read the ResNet paper. If you want to know more, there are two approaches: read the ResNet paper to see if it addresses this point, or look at the gradient tape documentation in TF to see if it says anything about this type of issue.
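To make the point about the two gradient paths concrete, here is a sketch for a single residual block. The notation is an assumption for illustration only (it is not taken from the course or the ResNet paper): $x$ is the block input, $F$ is the stack of layers being skipped, $y$ is the block output before any final activation, and $L$ is the loss.

$$
y = x + F(x)
$$

$$
\frac{\partial L}{\partial x}
= \frac{\partial L}{\partial y}\,\frac{\partial y}{\partial x}
= \frac{\partial L}{\partial y}\left(I + \frac{\partial F}{\partial x}\right)
= \underbrace{\frac{\partial L}{\partial y}}_{\text{skip path}}
+ \underbrace{\frac{\partial L}{\partial y}\,\frac{\partial F}{\partial x}}_{\text{path through the layers}}
$$

So backprop does not really “skip” anything: the gradient flows through both paths, and the two contributions are summed at $x$. The skip path contributes the upstream gradient unchanged, which is why some gradient still reaches earlier layers even when $\partial F / \partial x$ is very small.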
Hello @Liudmila_Kiseleva,
As Paul suggested, if you google how TensorFlow’s gradient tape calculates gradients, you should find that it uses “reverse-mode differentiation”. I have read about it before, so I would like to share this article about it with you. The article has an example that also contains a skip connection, which I think you will be happy to see if ResNet is your concern.
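For a rough feel of what reverse-mode differentiation does at a skip connection, here is a minimal hand-written sketch in plain Python. It is not TensorFlow’s implementation; the toy scalar block y = x + w2 * tanh(w1 * x) and all function names are assumptions made up for illustration.

```python
import math

def residual_block_forward(x, w1, w2):
    """Forward pass of a toy scalar residual block: y = x + w2 * tanh(w1 * x)."""
    z = w1 * x          # pre-activation of the (hypothetical) first layer
    a = math.tanh(z)    # activation
    f = w2 * a          # output of the residual branch F(x)
    y = x + f           # skip connection: add the input back
    cache = (x, w1, w2, a)  # values the backward pass will need
    return y, cache

def residual_block_backward(dy, cache):
    """Reverse-mode pass: push dL/dy back to x, w1 and w2 via the chain rule."""
    x, w1, w2, a = cache
    # The addition node passes dy unchanged to both of its inputs.
    dx_skip = dy
    df = dy
    # Walk backwards through the residual branch.
    dw2 = df * a
    da = df * w2
    dz = da * (1.0 - a * a)   # derivative of tanh is 1 - tanh^2
    dw1 = dz * x
    dx_branch = dz * w1
    # x fed two consumers in the forward pass, so its gradients are summed.
    dx = dx_skip + dx_branch
    return dx, dw1, dw2

x, w1, w2 = 0.5, 1.5, -2.0
y, cache = residual_block_forward(x, w1, w2)
dx, dw1, dw2 = residual_block_backward(1.0, cache)  # dL/dy = 1, i.e. L = y

# Sanity check only: compare against a central finite difference.
eps = 1e-6
numeric_dx = (residual_block_forward(x + eps, w1, w2)[0]
              - residual_block_forward(x - eps, w1, w2)[0]) / (2 * eps)
print(dx, numeric_dx)
```

The finite difference at the end is only a check on the hand-derived gradient; the backward pass itself uses exact derivatives of each operation, and the line `dx = dx_skip + dx_branch` is where the two paths meet.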
Cheers,
Raymond