Why should MobileNet have similar performance to a conventional ConvNet?

Hi again @yinshan and @Yousif,

I wanted to give the pen and paper exercise a go :smiley:

Standard convolution

has $(3 \times 3 \times 3 + 1) \times 128 = 28 \times 128 = 3584$ parameters: 128 filters of size $3 \times 3 \times 3$, each with one bias.

Depthwise convolution

has $(3 \times 3 \times 1 + 1) \times 3 = 10 \times 3 = 30$ parameters: one $3 \times 3 \times 1$ filter per input channel, each with one bias.

Pointwise convolution

has $(1 \times 1 \times 3 + 1) \times 128 = 4 \times 128 = 512$ parameters: 128 filters of size $1 \times 1 \times 3$, each with one bias.

Depthwise separable convolution

thus has $30 + 512 = 542$ parameters, compared to $3584$ for the standard convolution.
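
Written generally, the three counts follow the same pattern. Here $k$ is the kernel size, $C_{in}$ the number of input channels and $C_{out}$ the number of filters; these symbols are my own notation, and each filter is assumed to carry one bias:

$$
\begin{aligned}
\text{standard} &= (k \times k \times C_{in} + 1) \times C_{out} \\
\text{depthwise} &= (k \times k \times 1 + 1) \times C_{in} \\
\text{pointwise} &= (1 \times 1 \times C_{in} + 1) \times C_{out}
\end{aligned}
$$

Setting $k = 3$, $C_{in} = 3$ and $C_{out} = 128$ reproduces the $3584$, $30$ and $512$ from this example.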

Moreover, $542 / 3584 \approx 0.15$.

Hence, the depthwise separable convolution has only about 15% of the parameters of the standard convolution (for this example)!
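
For anyone who wants to check the arithmetic in code, here is a minimal sketch in plain Python. The function names `conv_params` and `depthwise_separable_params` are hypothetical helpers I made up for this post, not part of any library, and they assume square kernels and one bias per filter, as in the counts above.

```python
def conv_params(kernel, c_in, c_out, bias=True):
    """Parameter count of a standard kernel x kernel convolution.

    Hypothetical helper: each of the c_out filters has
    kernel * kernel * c_in weights plus an optional bias.
    """
    return (kernel * kernel * c_in + int(bias)) * c_out


def depthwise_separable_params(kernel, c_in, c_out, bias=True):
    """Parameter count of a depthwise convolution followed by a pointwise one."""
    # Depthwise: one kernel x kernel x 1 filter per input channel.
    depthwise = (kernel * kernel + int(bias)) * c_in
    # Pointwise: c_out filters of size 1 x 1 x c_in.
    pointwise = (c_in + int(bias)) * c_out
    return depthwise + pointwise


# The example from this post: 3x3 kernels, 3 input channels, 128 filters.
standard = conv_params(3, 3, 128)
separable = depthwise_separable_params(3, 3, 128)
ratio = separable / standard

print(standard, separable, round(ratio, 3))  # 3584 542 0.151
```

Passing `bias=False` to both helpers gives 411 versus 3456 parameters, so roughly 12%. The saving in this example does not depend much on whether the biases are counted.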

Thanks to Kunlun Bai for the amazing graphics:
