Wk2, MobileNet architecture video - Residual/skip connection

Ok, I think this boils down to how to interpret what Prof Ng means by that diagram. I looked at model.summary() of the MobileNet model in the Transfer Learning assignment. He’s not really showing a normal ConvNet diagram here, as he did in C4 W1, where each block shows the output of that block. The entire diagram shows one complete “Bottleneck” section. The input to that whole section is n x n x 3, and that same input also feeds the “skip” connection. Then, in the bottleneck layer, you do the following operations:

  1. The expansion, which is a 1 x 1 convolution with 18 output filters. So each filter is 1 x 1 x 3, and the result is n x n x 18. That produces the second box on the diagram.

  2. To get from the second to the third box in the diagram, you do a “depthwise” convolution, where each channel is filtered independently. The input and output of that operation are both n x n x 18.

  3. Then you do another 1 x 1 convolution (the “projection”) to reduce the channel count back to 3. So there are 3 filters there, each of dimension 1 x 1 x 18. That produces an output of n x n x 3.

  4. The “skip input”, shaped n x n x 3, is added to the output of step 3 to form the output of the entire bottleneck section.
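The four steps above can be sketched in NumPy to check that the shapes work out. This is just a shape demonstration with random weights, assuming a 3 x 3 depthwise kernel and omitting the activations and batch norm that the real MobileNet layers include; the function and variable names are mine, not from the assignment.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8  # spatial size, chosen arbitrarily for the demo

def conv1x1(x, w):
    # A 1 x 1 convolution is pure channel mixing: w has shape (c_in, c_out)
    return np.einsum('hwc,cd->hwd', x, w)

def depthwise3x3(x, w):
    # Per-channel 3 x 3 convolution with zero padding ("same"); w: (3, 3, c)
    h, wid, c = x.shape
    xp = np.pad(x, ((1, 1), (1, 1), (0, 0)))
    out = np.zeros_like(x)
    for i in range(h):
        for j in range(wid):
            patch = xp[i:i+3, j:j+3, :]            # (3, 3, c) window
            out[i, j, :] = np.sum(patch * w, axis=(0, 1))
    return out

x = rng.standard_normal((n, n, 3))        # block input, n x n x 3
w_expand = rng.standard_normal((3, 18))   # 18 filters, each 1 x 1 x 3
w_dw = rng.standard_normal((3, 3, 18))    # one 3 x 3 filter per channel
w_project = rng.standard_normal((18, 3))  # 3 filters, each 1 x 1 x 18

a = conv1x1(x, w_expand)      # step 1 (expansion): n x n x 18
b = depthwise3x3(a, w_dw)     # step 2 (depthwise): still n x n x 18
c = conv1x1(b, w_project)     # step 3 (projection): back to n x n x 3
out = x + c                   # step 4: residual add with the skip input

print(a.shape, b.shape, c.shape, out.shape)
# (8, 8, 18) (8, 8, 18) (8, 8, 3) (8, 8, 3)
```

Note that the residual add in step 4 only works because the projection brings the channel count back to match the skip input; that is the whole point of the bottleneck shape.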

So that is what the above diagram actually means. With that interpretation, I claim my reading matches the way the picture is drawn. You are simply interpreting the boxes in the drawing differently, but we are both describing the same actual operations.