Week 1 - Question 10. Debate

"You find that a team of ornithologists debating and discussing an image gets an even better 0.1% performance, so you define that as “human-level performance.” After working further on your algorithm, you end up with the following:

|Human-level performance|0.1%|
|Training set error |2.0%|
|Dev set error |2.1%|

Based on the evidence you have, which two of the following four options seem the most promising to try? (Check two options.)"

Hi there!
I’m having difficulty understanding why there are TWO correct responses to this question. What I understood from this course is:

  1. Train a bigger network to reduce avoidable bias (this one is OK)
  2. Get a bigger training set and/or adjust regularization to reduce variance

I was right on the first one, but I cannot understand why I should reduce the variance here: the training error and the dev error are close, which tells me that my model is generalizing well.
Can someone shed some light on this for me, please? :slight_smile:

Thanks!

2 Likes

I am wondering the exact same thing. My instinct was to only choose the “train a bigger network” but the question asks to pick two options.

3 Likes

I also had doubts about this question. If we have to both lower the training error by training a bigger network and reduce variance, why shouldn’t both increasing regularization and getting a bigger training set be correct?

1 Like

Hey @Ranchana_K,

When analyzing the performance of our model, the rule of thumb is to achieve a low error on the training set first. That means we address a high-bias problem first.

Check out this post on bias-variance tradeoff for details.

Feel welcome to ask further questions.

1 Like

Hello, @manifest. I have one more question. How can we train for a longer time? I have heard this a couple of times but do not understand why it helps reduce bias. Thanks.

1 Like

When we talk about training, we usually assume that we use a minibatch method. That means we train our model in iterations and use a fraction of the training set on each iteration.

When we’ve trained our model on the entire training set (i.e. have finished the first epoch of training), we may continue training by repeating the training examples.

Because each update uses only a fraction of the training set and never the entire set at once, the model may continue to learn useful information when it sees the training examples again. We typically shuffle the training examples between epochs to increase this effect.

Training for a longer time simply means using more iterations.

We should be careful, though, because repeating the same training examples increases the risk of overfitting.
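A minimal sketch of the loop described above, in plain Python. It assumes a toy one-parameter model y ≈ w·x on made-up, noise-free data; the function name `train` and its default hyperparameters are hypothetical and chosen only for illustration. It shows that more epochs (more iterations over reshuffled minibatches) bring the training error down, which is the sense in which training longer reduces bias.

```python
import random

def train(xs, ys, epochs, batch_size=4, lr=0.1, seed=0):
    """Fit y ~ w * x with minibatch gradient descent; return (w, training MSE)."""
    rng = random.Random(seed)
    w = 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):                  # one epoch = one full pass over the training set
        rng.shuffle(indices)                 # reshuffle so minibatches differ between epochs
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            # gradient of the mean squared error on this minibatch only
            grad = sum(2 * (w * xs[i] - ys[i]) * xs[i] for i in batch) / len(batch)
            w -= lr * grad                   # one iteration = one parameter update
    mse = sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
    return w, mse

# Toy data (an assumption for this sketch): 20 inputs in (0, 1], targets y = 3x.
xs = [i / 20 for i in range(1, 21)]
ys = [3 * x for x in xs]

for epochs in (1, 5, 20):
    w, mse = train(xs, ys, epochs)
    print(f"epochs={epochs:2d}  w={w:.4f}  training MSE={mse:.6f}")
```

Because this toy data has no noise, the sketch cannot show the overfitting risk mentioned above; on real data you would also track the dev set error while training longer.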

3 Likes

Thanks for your explanation. I understand it better now.

2 Likes

Hello everyone.

I am also having issues with this question, especially because I actually selected “get a bigger training set to reduce variance”, and my answer was marked WRONG with the justification: “No. Bias is higher than Variance”. If the training error is 2.0% and the dev error is 2.1%, is bias higher than variance?

Thank you in advance
Ricardo

1 Like

All these questions are subject to a certain amount of interpretive leeway. My take on this one is that what they really meant by their “feedback” comment is not so much that “Bias is greater than Variance” as an absolute number, but that Bias is the primary problem here and the thing that you need to address as the “first order of business”. If the Human Error (or another model) can produce an error of 0.1% and your model produces > 2% error on both training and dev data, then your first task is to figure out how to lower the error. In other words “remove avoidable bias”. So a more expressive network architecture is one potentially useful step, but getting more training data will probably not help in that scenario, right?
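To make the arithmetic concrete, here is a minimal Python sketch using the numbers from the quiz. It treats human-level performance as a stand-in for the best achievable error, as the course does; the function name `diagnose` is hypothetical and only for illustration.

```python
def diagnose(human_err, train_err, dev_err):
    """Split the error (in percent) into avoidable bias and variance."""
    avoidable_bias = train_err - human_err   # gap between training error and human-level error
    variance = dev_err - train_err           # gap between dev error and training error
    # Assumption of this sketch: address whichever gap is larger first.
    focus = "bias" if avoidable_bias > variance else "variance"
    return avoidable_bias, variance, focus

bias, var, focus = diagnose(human_err=0.1, train_err=2.0, dev_err=2.1)
print(f"avoidable bias = {bias:.1f}%, variance = {var:.1f}%, focus on: {focus}")
# prints: avoidable bias = 1.9%, variance = 0.1%, focus on: bias
```

With 1.9% of avoidable bias against 0.1% of variance, the bias-reducing options are the promising ones here, which matches the grader feedback quoted above.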

4 Likes

I understand your point
Thank you

1 Like

Hi Paul.
I want to know how I can get all the correct answers once I pass the test. For example, I get 80 points and pass the test, but there are still three wrong answers. How do I know what the correct answers are?
Thank you in advance
Lijiang

Sorry, but there are no official quiz answers in the same way there are no official solutions to the programming assignments. If you can’t find the question discussed here on Discourse, then your only recourse is to keep trying different answers until you find the right ones. Of course that is a hassle, since a) you’re limited to 3 tries per 8 hours and b) some of the quizzes are dynamic and don’t give exactly the same questions every time. So the “exhaust the possibilities” strategy may take a while.

1 Like

Are you sure that
2. Get a bigger training set and/or adjust regularization to reduce variance
is a correct answer to the question?

I understand there’s also the option
Try decreasing regularization
in addition to
Train a bigger network [...]
which to me sounds much more logical, as poor training performance might be a consequence of both the network being too small and the network being over-regularized. Both would increase bias, or am I wrong?

1 Like

Maybe I’m just misinterpreting your point here, but no, that is not a correct answer to this question as I pointed out in my earlier reply. The primary problem is avoidable bias and more data won’t help with that: you already can’t fit the data you have, right? Of course the “and/or” there makes it a little ambiguous, but “reduce variance” is not the desideratum in any case, right? The point is you need more variance, so how can you achieve that? And I thought that is consistent with the feedback someone reported from the quiz grader. Meaning that they tried selecting that answer and it was marked incorrect.

Yes, that sounds right. It was already agreed earlier on this thread that training a more expressive network is clearly a valid approach to try. The question doesn’t really give you any information on how heavily regularization was used in the given 2%/2.1% solution, but if it is used then it would be a valid approach to try less of it. Did you try that answer and get a bad response from the quiz grader? Of course just reducing regularization is not likely to be a complete solution, since it may increase the difference between the train and test accuracy.
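As a rough illustration of the regularization point, here is a small Python sketch. It assumes a toy one-dimensional ridge (L2-regularized) regression instead of a neural network, with made-up data; the helper names `ridge_fit` and `train_mse` and the penalty values are hypothetical. In this setting the training error can only go up as the penalty grows, so decreasing regularization is one way to lower training error.

```python
def ridge_fit(xs, ys, lam):
    """Closed-form 1-D ridge regression without intercept: w = sum(x*y) / (sum(x*x) + lam)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)

def train_mse(xs, ys, w):
    """Mean squared error of the model y ~ w * x on the training data."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Toy data (an assumption for this sketch): targets follow y = 3x exactly.
xs = [i / 10 for i in range(1, 11)]
ys = [3 * x for x in xs]

lams = [0.0, 1.0, 10.0, 100.0]   # regularization strengths, weakest to strongest
errors = [train_mse(xs, ys, ridge_fit(xs, ys, lam)) for lam in lams]

for lam, err in zip(lams, errors):
    print(f"lambda={lam:6.1f}  training MSE={err:.4f}")
```

The sketch only looks at training error. As noted above, less regularization may also widen the gap between training and dev performance, so the dev set error still needs to be checked.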

1 Like

I might have misunderstood the topic myself, but I was under the impression that the initial question implied the two points given in the first post were both (correct) answers to the quiz question.

The intended question then would have been why the second option about getting a bigger training set was correct despite being counterintuitive. If so, the premise of the whole thread might have been wrong, since what was quoted as checkbox 2 was wrong to start with. The second correct answer should then have been the option that I stated, about decreasing regularization and therefore increasing variance, which was yet another one of the options in the quiz.

I did in fact get the answer wrong in the quiz, but for different reasons… :sweat_smile:

1 Like