Why should MobileNet have similar performance to a conventional ConvNet?

Hello @yinshan and @Yousif,

Great question! Sorry for the late reply. It is sometimes worth bumping a thread when it goes stale, since it may have slipped under the mentors' radar :sweat_smile:

Depthwise separable convolutions reduce the number of parameters in a convolution layer. For a small model, replacing the standard 2D convolutions with depthwise separable ones can therefore decrease the model capacity significantly, and the model may end up sub-optimal. However, if used properly, depthwise separable convolutions can give you efficiency without dramatically hurting model performance.

The key takeaway, again, is:

The depthwise separable convolutions reduce the number of parameters in the convolution.

As an exercise, make up a toy example and calculate the number of parameters for each method. Prof. Andrew Ng demonstrated the number of multiplications, but not the number of parameters. It is a good exercise and might help you understand this even better.
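
If you want something to check your hand calculation against, here is a minimal sketch in plain Python. The toy sizes (3x3 kernel, 3 input channels, 5 output channels) and the helper names `standard_conv_params` and `depthwise_separable_params` are my own assumptions for illustration, not taken from the lecture or from any library. It also assumes square kernels and one bias per output channel in each layer.

```python
def standard_conv_params(k, c_in, c_out, bias=True):
    """Parameters in a standard 2D convolution with k x k kernels.

    Each of the c_out filters spans all c_in input channels.
    """
    weights = k * k * c_in * c_out
    return weights + (c_out if bias else 0)


def depthwise_separable_params(k, c_in, c_out, bias=True):
    """Parameters in a depthwise separable convolution.

    Depthwise step: one k x k filter per input channel.
    Pointwise step: c_out filters of size 1 x 1 x c_in.
    """
    depthwise = k * k * c_in + (c_in if bias else 0)
    pointwise = c_in * c_out + (c_out if bias else 0)
    return depthwise + pointwise


# Toy example (assumed sizes): 3x3 kernel, 3 input channels, 5 output channels.
k, c_in, c_out = 3, 3, 5

std = standard_conv_params(k, c_in, c_out, bias=False)
sep = depthwise_separable_params(k, c_in, c_out, bias=False)

print(std)                    # 135
print(sep)                    # 42
print(f"{sep / std:.3f}")     # 0.311, which equals 1/c_out + 1/k**2
```

Try increasing `c_out` to something like 256: the ratio approaches `1 / k**2`, which shows why the savings are larger in wide layers than in a tiny toy layer.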