Questions about different definitions

Hi friend,

I am watching "Bias and Variance with Mismatched Data Distributions" and I am confused about some of the definitions.

Q1. For training error: my understanding is, let's say you have 1M samples to train the NN. After training, you test the NN on the same 1M training samples; if it shows 99% accuracy, then the training error is 1%. Is this correct?

(Referring to the slide shown in the lecture; screenshot not included.)

Q2. Data mismatch just means a different distribution between the training set and the dev/test set, like professional high-res cat pictures vs. blurry mobile cat pictures. Is this correct?

Q3" Error on examples trained on," I am not follow here, what caused it to happen? Is this distribution problem again?

Q4. "Error on examples not trained on": what causes it to happen?

Especially for Q3 and Q4, Professor Ng was talking about distributions. Can someone give me more detail? Thank you!

Q1 - Yes
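
To make that concrete, here is a minimal sketch (toy data and hypothetical names, assuming a scikit-learn-style classifier, not the course's own code) of scoring a model on the very examples it was trained on:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Toy stand-in for the 1M training examples (10k here to keep it fast).
X_train, y_train = make_classification(n_samples=10_000, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Score on the SAME examples the model was trained on.
train_accuracy = model.score(X_train, y_train)
train_error = 1.0 - train_accuracy  # e.g. 0.01 means 1% training error
print(f"training error: {train_error:.2%}")
```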

Q2 - Yes. This whole section is dealing with the case in which the training data and the “dev/test” data come from different distributions.

Q3 - That is just “training error” by definition. But note that this is the sophisticated case in which, to help get around some of the issues with the different distributions between training and dev/test data, you subdivide the training data into the “training set” and the “train-dev set”. In that case you only train on the “training set” subset.

Q4 - That is the error you get on the “train-dev” set, which you don’t use for training.
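
Putting Q3 and Q4 together, here is a minimal sketch of that split and the two error numbers (toy data and hypothetical names, assuming a scikit-learn-style classifier, not the course's own code):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Toy stand-in for all data from the training distribution
# (e.g. the professional high-res web cat pictures).
X_all, y_all = make_classification(n_samples=20_000, random_state=0)

# Carve a "train-dev" set out of the training-distribution data.
X_train, X_train_dev, y_train, y_train_dev = train_test_split(
    X_all, y_all, test_size=0.1, random_state=0)

# Train ONLY on the training subset.
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Q3: error on examples the model WAS trained on.
train_error = 1.0 - model.score(X_train, y_train)

# Q4: error on examples it was NOT trained on,
# but drawn from the SAME distribution as the training set.
train_dev_error = 1.0 - model.score(X_train_dev, y_train_dev)

print(f"train error:     {train_error:.2%}")
print(f"train-dev error: {train_dev_error:.2%}")
```

A large gap between those two numbers is a variance problem, since both sets come from the same distribution; a further jump from the train-dev error to the dev-set error (different distribution) is the data mismatch the lecture is about.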

If my explanation is not enough, I suggest you watch that sequence of lectures again with what I said above in mind and I bet it will make more sense the second time around. Prof Ng is the best teacher: if you don’t get what he says the first time, it’s worth listening again. Course 3 covers a lot of sophisticated issues.

Thank you, Paul!