Hi,
In MobileNetV2, we use a residual connection in the shortcut (skip connection) path.
In the normal (main) path we apply a bottleneck block, which comprises expansion, depthwise, and projection convolutions.
But I can't work out how this bottleneck block is useful, since we are using a residual connection which doesn't go through this block?
I assume that the residual connection passes the activation of the current layer to the next layer, to make it easier for the network to learn the identity function, while simultaneously the input goes through the main path normally (expansion → depthwise → projection), roughly as in the sketch below.
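Here is a minimal PyTorch-style sketch of what I mean (assuming the shortcut is only used for stride-1 blocks with matching channel counts; the class and argument names are just illustrative):

```python
import torch
import torch.nn as nn

class InvertedResidual(nn.Module):
    """Rough sketch of a MobileNetV2-style inverted residual block."""
    def __init__(self, in_ch, out_ch, stride=1, expand_ratio=6):
        super().__init__()
        hidden = in_ch * expand_ratio
        # The skip connection only applies when input and output shapes match.
        self.use_residual = stride == 1 and in_ch == out_ch
        self.block = nn.Sequential(
            # 1x1 expansion: widen the channel dimension
            nn.Conv2d(in_ch, hidden, 1, bias=False),
            nn.BatchNorm2d(hidden),
            nn.ReLU6(inplace=True),
            # 3x3 depthwise convolution: one filter per channel
            nn.Conv2d(hidden, hidden, 3, stride, padding=1, groups=hidden, bias=False),
            nn.BatchNorm2d(hidden),
            nn.ReLU6(inplace=True),
            # 1x1 linear projection back down to out_ch (no ReLU here)
            nn.Conv2d(hidden, out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch),
        )

    def forward(self, x):
        out = self.block(x)   # main path: expansion -> depthwise -> projection
        if self.use_residual:
            out = out + x     # shortcut: the input is added to the main-path output
        return out

# Example: a stride-1 block with matching channels uses the shortcut.
y = InvertedResidual(32, 32)(torch.randn(1, 32, 56, 56))
```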
Am I correct?
The whole point of the residual connection is to avoid vanishing or exploding gradients, i.e. the instabilities that occur in deep NNs. A copy of an earlier, less processed input is fed in again further on, so there are still features to learn from in case the gradients through the main path have become problematic. Because the block's output is x + F(x), the gradient always has a direct identity path back to the input, as the toy example below illustrates.
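A toy example (assuming PyTorch; the numbers are hypothetical) showing the identity path carrying gradient even when the main path contributes nothing:

```python
import torch

x = torch.randn(4, requires_grad=True)
weight = torch.zeros(4)        # extreme case: the main path has collapsed to zero
main_path = weight * x         # F(x) = 0, so it contributes no gradient

out = main_path + x            # residual output: x + F(x)
out.sum().backward()
print(x.grad)                  # tensor([1., 1., 1., 1.]): the shortcut still carries gradient
```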