In the lecture, the following passage describes how reversible residual layers back-calculate X1 and X2. It was not clarified what the difference between X1 and X2 is. I thought they were the same, with one being a duplicate of the other. Am I wrong? If they are the same, why do we need to back-calculate both X1 and X2? If they are different, what is the difference?
“Reversible residual layers allow you to reconstruct the forward layer from the end of the network. Usually you have two similar branches in the network that you use to compute the network.”
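
For context, here is a minimal sketch of how I understand a reversible residual block, assuming the lecture follows the usual two-stream formulation: y1 = x1 + F(x2) and y2 = x2 + G(y1), inverted by x2 = y2 - G(y1) and x1 = y1 - F(x2). The functions `f` and `g` are toy stand-ins for the attention and feed-forward sublayers, and starting both streams as copies of the same input is an assumption made only for illustration.

```python
import numpy as np

def f(x):
    # Hypothetical stand-in for the first sublayer (e.g. attention).
    return np.tanh(x)

def g(x):
    # Hypothetical stand-in for the second sublayer (e.g. feed-forward).
    return 0.5 * x ** 2

def forward(x1, x2):
    # Each stream is updated using the other stream.
    y1 = x1 + f(x2)
    y2 = x2 + g(y1)
    return y1, y2

def reverse(y1, y2):
    # Undo the updates in the opposite order, using only the outputs.
    x2 = y2 - g(y1)
    x1 = y1 - f(x2)
    return x1, x2

x = np.array([0.1, -0.2, 0.3])
x1, x2 = x.copy(), x.copy()      # assumption: both streams start as copies
y1, y2 = forward(x1, x2)
r1, r2 = reverse(y1, y2)

print(np.allclose(y1, y2))                          # do the streams stay identical?
print(np.allclose(r1, x1), np.allclose(r2, x2))     # are both inputs recovered?
```

In this sketch the two streams stop being equal after a single block even when they start as copies, because each one receives a different update. Recovering x1 requires x2 first, which in turn requires both y1 and y2, so both streams have to be back-calculated.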