Hello everyone,

First off, I am not sure if the nomenclature used in the title makes things clear, but just to clarify, this question comes from the Deep Learning Specialization, Fourth course “Convolutional Neural Networks”, Week 2, Test Question # 10.

The question basically asks to pick the correct values for W, Y, Z, with the following info:

Following Andrew’s steps in the lesson presentation, my reasoning went as follows:

- Expansion Filter: 1x1xW, where W=5 because its depth has to match the depth of the input block
- Expansion Block: nxnxY, where Y = ~6*W and thus Y=30
- There are 20 “pointwise” filters (which I am gonna temporarily call “PWF”)
- Projecting block: nxnxZ, where Z= PWF=20
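
The steps above can be traced as plain shape bookkeeping. This is only a sketch: the quiz picture is not reproduced in this thread, so the spatial size `n=8` and the function name `bottleneck_shapes` are made up for illustration, and it assumes stride 1 with "same" padding so that n does not change. Only the channel counts 5, 30 and 20 come from the posts here.

```python
def bottleneck_shapes(n, in_channels, expansion_filters, projection_filters):
    """Track (height, width, channels) through a MobileNet v2 style bottleneck.

    Assumes stride 1 and "same" padding, so the spatial size n is unchanged.
    """
    return {
        "input": (n, n, in_channels),
        # Each 1x1 expansion filter must be as deep as the input block.
        "expansion_filter": (1, 1, in_channels),
        # One output channel per expansion filter.
        "after_expansion": (n, n, expansion_filters),
        # Depthwise convolution filters each channel separately: depth is unchanged.
        "after_depthwise": (n, n, expansion_filters),
        # Each 1x1 projection (pointwise) filter must be as deep as its input.
        "projection_filter": (1, 1, expansion_filters),
        # One output channel per projection filter.
        "after_projection": (n, n, projection_filters),
    }


# n=8 is a hypothetical value; 5, 30 and 20 are the numbers discussed in this thread.
shapes = bottleneck_shapes(n=8, in_channels=5, expansion_filters=30, projection_filters=20)

W = shapes["expansion_filter"][2]
Y = shapes["after_depthwise"][2]
Z = shapes["after_projection"][2]
print(W, Y, Z)  # 5 30 20
```

Note that W and Z come from two different rules: W is forced by the depth of the input, while Y and Z are simply how many filters the layer has.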

Therefore, **my** choice would be “W=5, Y=30, Z=20” (while I am at it, it would also be nice to know whether my reasoning is correct). However, here is where the confusing part appears: **I have chosen that option (as well as the other 3, just to try), but for all 4 options the grader marks the answer as “wrong” every time! Why is this happening?**

Any thoughts on this weird behaviour are appreciated, thank you.

I think that either you must be misinterpreting the output of the quiz grader or maybe there was something wrong on the servers when you tried your experiment. I agree with your reasoning and I chose W = 5, Y = 30 and Z = 20 and the grader accepted that as correct. Mind you, I still have some other questions on that quiz that are hurting my head!

Well maybe what I said is not quite right: I agree with your answer, but I don’t understand the point about Y = 6 * W. The value of Y is completely independent of the value of W, right? Y is just determined by the number of output filters that you have at that layer (each of which is shaped f x f x 5).
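
To make that point concrete, here is a minimal pure-Python sketch of a 1x1 convolution (so f = 1). The helper `conv1x1`, the spatial size `n = 4` and the random values are all assumptions for illustration. The only thing it is meant to show is that the output depth equals the number of filters, while the input depth only fixes how deep each filter is.

```python
import random


def conv1x1(volume, filters):
    """Apply 1x1 filters to an n x n x c_in volume stored as nested lists.

    `filters` is a list of weight vectors, each of length c_in.
    The result has one output channel per filter.
    """
    return [
        [
            [sum(w * x for w, x in zip(f, pixel)) for f in filters]
            for pixel in row
        ]
        for row in volume
    ]


random.seed(0)
n, c_in, num_filters = 4, 5, 30  # n is hypothetical; 5 and 30 are from the thread

volume = [[[random.random() for _ in range(c_in)] for _ in range(n)] for _ in range(n)]
filters = [[random.random() for _ in range(c_in)] for _ in range(num_filters)]

out = conv1x1(volume, filters)
out_shape = (len(out), len(out[0]), len(out[0][0]))
print(out_shape)  # (4, 4, 30)
```

Changing `num_filters` to 20 would give a depth of 20 with the same 5-channel input, which is the sense in which Y does not depend on W.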

Hello @paulinpaloalto , it is nice to see (read?) you again.

My thoughts:

- There is no “grader misinterpretation” on my side: I was scoring 90% and getting 0/1 points on question 10. Like I said before, I have already tried it 4 times (unfortunately, I’ll have to wait 8 more hours before I can try again), so I think that by now I understand the scoring quite well.
- I agree with your “*The value of Y is completely independent of the value of W*” assertion, but since that left me with 2 possible “correct” candidates (Y=20 or Y=30), I relied on Andrew’s quick comment in the video “MobileNet Architecture” (around minute 4:13 to 4:36): “*A factor of expansion of six is quite typical in MobileNet v2 which is why your inputs goes from n by n by three to n by n by 18, and that’s why we call it an expansion as well, it increases the dimension of this by a factor of six*”… Thus, I ended up picking 30 instead of 20.
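
The arithmetic behind that quote can be checked in a couple of lines. This is just a sketch: `expanded_channels` is a made-up helper, and the factor of six is described in the lecture as typical, not as a rule that fixes the number of filters.

```python
def expanded_channels(in_channels, expansion_factor=6):
    """Channel count after expansion if the designer uses the given factor."""
    return in_channels * expansion_factor


lecture_example = expanded_channels(3)  # n x n x 3 input from the lecture quote
quiz_guess = expanded_channels(5)       # n x n x 5 input from this question

print(lecture_example)  # 18
print(quiz_guess)       # 30
```

With 5 input channels the typical factor happens to give 30, so this heuristic is at least consistent with picking Y=30 over Y=20.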

I will proceed like this: I am going to retake the test in 8 hours and take some screenshots of the answers I provide vs. the score I get, so by then one of two things will happen: either the grader will have gone sane again, or I will have further documentation to continue studying my case (which will, BTW, remain open until then).

See you then I guess, thank you.

Sorry, I missed or had forgotten the bit about 6x expansion being common. But they pretty clearly say in the picture that there are 30 filters and that’s the way conv layers work, right? The number of channels in the output layer is the number of filters. And Y stays the same in the third layer and they show the input to the 4th layer as having 30 channels. Doesn’t seem like a lot of deeper explanation is required …


Hello again @paulinpaloalto ,

Sorry about the delay here.

Indeed, the explanation was quite clear. I managed to successfully answer that query.

Please consider the thread closed, and thank you for your time!