C2W3 Video Evaluating a model - Classification fraction

Hi,
In Course 2, Week 3, 2nd video (Evaluating a model), at about 09:26, it is said that J_test is the fraction of the test set that has been misclassified, and likewise J_train for the training set.

I understand that we count the number of misclassified examples in the training set and the test set, and that we can compute the fraction of misclassified examples for both sets based on their respective sizes m_train and m_test.

But I don’t understand why J_test and J_train should represent that same fraction of misclassified examples.

Can you please explain what I am misunderstanding?
Thanks

In theory, the test performance of an (ideal) model should be the same as its training performance. But this rarely happens in practice: test performance is almost always worse than training performance.

Hi, thank you.

Yes, I think I understand what you are telling me, but I don’t think that was my question.
In the video, Andrew explains that J_test(w,b) is the fraction of the test set that has been misclassified and that J_train(w,b) is the fraction of the training set that has been misclassified.

As I understand it, the math does not work out.
For example, for the training set, the fraction that has been misclassified would be count(y-hat <> y_train) / size(train set), which is not J_train. Otherwise, the expression for J_train on this slide is not the same as the previous one.

It’s not technically a “fraction” in the mathematical sense.

Maybe “proportion” or “measurement” or “metric” would have been better choices.

OK, maybe we can look at it like this.

I would now say that the cost function using cross-entropy is a measurement or metric of prediction error, not a fraction (calling it a fraction is honestly misleading).
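
To make the difference concrete, here is a minimal sketch of the two quantities discussed in this thread. It assumes a 0.5 decision threshold and uses made-up predictions; the function names and the data are hypothetical and do not come from the course material.

```python
import math

def misclassified_fraction(probs, labels, threshold=0.5):
    """Fraction of examples whose thresholded prediction differs from the label."""
    wrong = sum(
        1 for p, y in zip(probs, labels)
        if (1 if p >= threshold else 0) != y
    )
    return wrong / len(labels)

def cross_entropy_cost(probs, labels):
    """Average binary cross-entropy (logistic) loss over the examples."""
    total = 0.0
    for p, y in zip(probs, labels):
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(labels)

# Hypothetical model outputs f(x) and true labels, for illustration only.
probs = [0.9, 0.8, 0.4, 0.3]
labels = [1, 0, 1, 0]

frac = misclassified_fraction(probs, labels)  # 2 of 4 examples are wrong
cost = cross_entropy_cost(probs, labels)      # average loss, not a count

print(f"misclassified fraction: {frac:.2f}")
print(f"cross-entropy cost:     {cost:.3f}")
```

On this toy data the two numbers differ: the first is always between 0 and 1 because it is a count divided by the set size, while the second is an average loss that has no upper bound and depends on how confident each prediction is.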
Thanks
