Both ResNet and MobileNet use residual connections. It's unclear to me whether the residual connection path is taken at training, at testing, or both. If both, I don't understand when the non-residual path (i.e. expansion -> depthwise -> projection, in the case of MobileNet) is carried out if these elements are skipped over. And why would the contents of the standard path matter if they are skipped? Thanks.
The skip connections are part of the network, right? The point is that the skip values get merged (by element-wise addition) with the standard path results, so that both of them contribute to the output of any given segment of the network. It's not that you're selecting one or the other. Both paths are always taken and both outputs matter, in training and in testing. And gradients propagate along both paths during training.
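To make the "both paths, then merge" idea concrete, here is a minimal toy sketch in plain Python. It is not the actual ResNet or MobileNet code: the function names and the scalar weights `w_expand` and `w_project` are made up for illustration, and the "standard path" is just a crude stand-in for expansion -> depthwise -> projection.

```python
def relu(values):
    # Element-wise ReLU on a list of floats.
    return [max(0.0, v) for v in values]

def standard_path(x, w_expand, w_project):
    # Hypothetical stand-in for the non-residual path:
    # scale up, apply a nonlinearity, then scale back down.
    hidden = relu([w_expand * v for v in x])
    return [w_project * h for h in hidden]

def residual_block(x, w_expand, w_project):
    # The standard path is always computed; nothing is skipped.
    f = standard_path(x, w_expand, w_project)
    # The skip connection adds the block's input to that result.
    return [a + b for a, b in zip(x, f)]

x = [1.0, -2.0, 3.0]
skip_only = x                               # what the skip path carries
standard_only = standard_path(x, 2.0, 0.5)  # what the standard path computes
out = residual_block(x, 2.0, 0.5)           # element-wise sum of the two

print(skip_only, standard_only, out)
```

The block's output differs from either path taken alone, which is why the contents of the standard path matter. The same forward computation runs at training and at test time. Because the output is `x + F(x)`, the gradient with respect to `x` also has a contribution from each path.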