Backpropagation in ResNets

It’s a legitimate question, but note that we have now graduated to doing everything in TensorFlow. One important side effect of using a platform like TF (or PyTorch, Caffe, or …) is that the mechanics of backpropagation and gradients are handled for us by the platform “under the covers”. You can still imagine what must happen in principle: the point of Residual Networks is that the shortcut path gives gradients a direct route around each block, which helps them flow through deep networks. The forward-propagation graph is therefore no longer a simple chain — it contains parallel paths. The flow of gradients during backpropagation must mirror the forward structure, of course: at the node where each shortcut path diverges, two independent gradients feed backwards into that node, and they need to be combined. By the multivariable chain rule, they are combined by summation, not averaging. The notebook gives the reference to the original paper that defined all this; if you want to probe more deeply, have a look and see whether the authors say anything about how backpropagation works in this architecture.
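To make the gradient combination concrete, here is a minimal sketch in plain Python (no framework; the function names and the toy main path f(x) = w·x are my own illustrative choices, not anything from the notebook). It shows that for a residual block y = f(x) + x, the gradient arriving at x is the sum of the gradients from the main path and the shortcut path:

```python
# Toy residual block: forward is y = f(x) + x, with f(x) = w * x.
# By the multivariable chain rule, the gradient at the branch point
# is the SUM of the gradients flowing back along each path.

def residual_forward(x, w):
    main = w * x       # main (weighted) path
    shortcut = x       # identity shortcut path
    return main + shortcut

def residual_backward(x, w, upstream_grad):
    # Gradient through the main path: d(w*x)/dx = w
    grad_main = upstream_grad * w
    # Gradient through the identity shortcut: dx/dx = 1
    grad_shortcut = upstream_grad * 1.0
    # The two path gradients are summed where the paths rejoin x
    return grad_main + grad_shortcut

x, w = 3.0, 2.0
print(residual_forward(x, w))          # y = 2*3 + 3 = 9.0
print(residual_backward(x, w, 1.0))    # dy/dx = w + 1 = 3.0
```

Note that the `+ 1` contribution from the identity path is exactly why gradients in ResNets cannot vanish entirely through a block: even if the main path’s gradient shrinks, the shortcut still passes the upstream gradient through unchanged.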