For question 10, we’re asked to calculate the number of parameters of a MobileNet V2 bottleneck block.
We have an input volume that is n x n x 5, and 30 filters are used for the expansion.
In the depthwise convolution we use 3 x 3 filters, and 20 filters are used for the projection.
A bias is not used.
How is the number of parameters calculated?
I watched the videos again but it doesn’t seem to be explained.
Is the number of parameters independent of the size of the input volume?
In a conv layer, the size of the input matters for the number of parameters only in the channels dimension, because it’s the filters that are the parameters. So if the filters are 1 x 1 x 5 or 3 x 3 x 30, it doesn’t matter what the h and w dimensions of the various inputs are, right?
Right, I guess that’s one of the primary benefits of convolutional networks: we can have a very large input/image without the number of trainable parameters exploding.
I think I confused this with the number of computations required, which was said to be: filter_parameters x filter_positions x filters.
Am I correct in saying that for computational cost, the input size does matter because it affects the filter_positions?
Back to the original question.
I’ll try to redo the quiz until I get the same question again, but I don’t think I understand it perfectly yet.
From the week 1 videos, we have:
parameters = (f x f x n_c[l-1] + 1) x n_c[l] (I hope this notation is readable)
Since the question says we don’t have a bias, would the calculation be:
(3 x 3 x 30) x 20?
The answer I get seems too high if I remember the possible choices.
Yes, you’re right that the input size matters for compute cost, because it affects the number of “steps” you have to take with each filter. The number of parameters, however, is independent of the input h and w dimensions; it depends only on the c dimension.
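To make the difference concrete, here is a rough Python sketch that counts both quantities for a standard convolution. The function names are made up for illustration, and I'm assuming stride 1 with "same" padding, so the output is n x n and the number of filter positions is n * n.

```python
def conv_params(f, c_in, c_out, bias=True):
    """Trainable parameters of a standard f x f convolution layer."""
    per_filter = f * f * c_in + (1 if bias else 0)
    return per_filter * c_out


def conv_multiplications(f, c_in, c_out, out_h, out_w):
    """Multiplications = filter parameters x filter positions x number of filters."""
    filter_parameters = f * f * c_in
    filter_positions = out_h * out_w
    return filter_parameters * filter_positions * c_out


# Assumption: stride 1 and "same" padding, so the output is n x n.
for n in (32, 64):
    params = conv_params(3, 30, 20, bias=False)
    mults = conv_multiplications(3, 30, 20, n, n)
    print(f"n={n}: parameters={params}, multiplications={mults}")
```

The parameter count stays the same for both values of n, while the multiplication count grows with the number of filter positions.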
Yes, that sounds like the right number of parameters with no bias, f = 3, 30 input channels and 20 output filters. It’s been quite a while since I last took that quiz, so I forget exactly the shape of the block they are asking about, but it may also have a depthwise layer.
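For the bottleneck block, the expansion filters use 1 × 1 × 5 × 30 = 150 parameters, the depthwise convolution needs 3 × 3 × 30 = 270 parameters (one 3 × 3 filter per channel), and the projection uses 1 × 1 × 30 × 20 = 600 parameters, for a total of 150 + 270 + 600 = 1020.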
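If it helps, the same count can be written as a small Python sketch. The function name and argument names are just placeholders I picked, and it assumes no bias terms, 1 x 1 filters for the expansion and projection, and one f x f filter per channel in the depthwise step.

```python
def bottleneck_params(c_in, c_expand, f, c_out):
    """Parameter counts for a bottleneck block without bias terms.

    Assumes 1 x 1 expansion and projection filters, and a depthwise
    step with one f x f filter per expanded channel.
    """
    expansion = 1 * 1 * c_in * c_expand      # c_expand filters, each 1 x 1 x c_in
    depthwise = f * f * c_expand             # one f x f filter per channel
    projection = 1 * 1 * c_expand * c_out    # c_out filters, each 1 x 1 x c_expand
    return expansion, depthwise, projection


# Values from the quiz question: n x n x 5 input, 30 expansion filters,
# 3 x 3 depthwise filters, 20 projection filters.
expansion, depthwise, projection = bottleneck_params(c_in=5, c_expand=30, f=3, c_out=20)
total = expansion + depthwise + projection
print(expansion, depthwise, projection, total)
```

Note that n never appears in the function, which is another way of seeing that the parameter count does not depend on the height and width of the input.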