Module1, Setting Up your Goal: Is one test set sufficient for an adequate model performance estimation?

Thanks for the detailed response. You have described lots of great ideas. I don’t think Prof Ng mentions the concept of a stratified split anywhere, but I’m pretty sure he does discuss the ideas around balanced datasets somewhere in Course 3. By "balanced" I mean how evenly the samples are spread across the possible label types in your data.

You may be working a bit too hard in the stratification case. If you have a large dataset and randomly select a non-trivial subset of it, then you would naturally expect the distribution of the label classes in your selected subset to be very close to that of the total dataset. If it’s not, then doesn’t that simply mean that either your random shuffling is not really random or your subset size is too small? But it is a good point that it is worth analyzing the distribution of the label types in your various random subsets to make sure they are reasonable. And it may well be that if some classes are underrepresented in the overall dataset, you get better behavior by including more of them in the test set, as long as you can do that without depleting that class too severely in the training set.
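Here is a minimal sketch of that kind of sanity check, in case it is useful. Everything in it is made up for illustration: the three classes, the 90/8/2 class proportions, the dataset size, and the 10% test fraction are all assumptions, and `class_proportions` is just a hypothetical helper. It only uses NumPy to shuffle the indices, carve off a random test set, and compare the label distribution of that subset with the full dataset.

```python
import numpy as np

def class_proportions(labels, num_classes):
    """Return the fraction of samples belonging to each class."""
    counts = np.bincount(labels, minlength=num_classes)
    return counts / counts.sum()

rng = np.random.default_rng(0)  # fixed seed, only for reproducibility

# Hypothetical imbalanced dataset: 3 classes with made-up proportions.
num_classes = 3
labels = rng.choice(num_classes, size=100_000, p=[0.90, 0.08, 0.02])

# Shuffle the indices and take a random 10% as the test set.
indices = rng.permutation(len(labels))
test_size = len(labels) // 10
test_labels = labels[indices[:test_size]]
train_labels = labels[indices[test_size:]]

# Compare the class distribution of the test subset with the full dataset.
full_props = class_proportions(labels, num_classes)
test_props = class_proportions(test_labels, num_classes)
max_gap = np.abs(full_props - test_props).max()

print("full dataset:", np.round(full_props, 4))
print("test subset: ", np.round(test_props, 4))
print("largest gap: ", round(float(max_gap), 4))
```

With a subset this large, the gap should come out small. If you rerun it with a much smaller `test_size`, you will typically see the proportions drift further apart, especially for the rarest class, which is the situation where a stratified split starts to earn its keep.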

But the overall conclusion seems to be that you are well equipped to handle all these issues when it comes time to tackle a serious real world problem!