Course 3 Week 2 Quiz Question Phrasing

nick_valverde · December 15, 2022, 7:10pm

Hello, I just took the quiz and there was some wording on a question that irked me.
The question reads:

The distribution of data you care about contains images from your car’s front-facing camera, which comes from a different distribution than the images you were able to find and download off the internet. The best way to split the data is using the 900,000 internet images to train, and divide the 100,000 images from your car’s front-facing camera between dev and test sets. True/False?

I looked at this for a bit, wondering whether you mean a 50-50 split or just, in general, dividing the data between dev and test. I think this question would be better if rephrased to specify the exact division you mean.

Juan_Olano · December 15, 2022, 7:22pm

Hi @nick_valverde,

I would argue that the question is properly worded. There was a section in this week where you learned about some criteria to distribute data among the different datasets. I would ask you: would you use 100,000 images for the dev and test sets?

Thoughts?

Juan

nick_valverde · December 15, 2022, 7:55pm

I understand you would, and the rule of thumb is an 80/20 split. However, the question doesn’t ask whether or not you would split the data.
I’m not sure how to explain without giving the answer so feel free to delete this.

Answering True results in a wrong answer. The feedback says that you want to split your samples using the 80/20 rule.
At first I read the question as meaning an equal division of samples, which would make it False. But it does not say “equally divide” or give a quantifier. It just says “divide among.”
The next question that comes after says “You have finally decided to split the data …” then gives the breakdown. So I thought it meant to split the samples in general.
Adding a quantifier to the question would get rid of the ambiguity.
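
To make the two readings concrete, here is a minimal Python sketch of how the same 100,000 camera images could be "divided" under each interpretation. The helper name `split_counts` is hypothetical, and the 80/20 ratio (and which set gets the larger share) is only an illustrative assumption taken from the rule of thumb mentioned above, not the quiz's official answer.

```python
def split_counts(total, dev_fraction):
    """Return (dev, test) counts when `dev_fraction` of `total` goes to dev."""
    dev = round(total * dev_fraction)
    return dev, total - dev


# Number of front-facing camera images given in the quiz question.
car_images = 100_000

# Reading 1: "divide" means an equal 50/50 split between dev and test.
equal_split = split_counts(car_images, 0.5)

# Reading 2: "divide" means any split at all; 80/20 is an assumed example ratio.
uneven_split = split_counts(car_images, 0.8)

print("equal:", equal_split)
print("uneven:", uneven_split)
```

Both results are valid ways to "divide" the images, which is why a quantifier in the question would remove the ambiguity.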

Juan_Olano · December 15, 2022, 10:05pm

I guess the author of the quiz didn’t want to word it in a very obvious way. The wording, in my opinion, makes it a bit harder, and I personally think that this is good. At least in my case, it made me think twice and, if memory serves, I think I failed it the first time.
