Doubt in non-graded portion in week2 ResNets programming assignment

The fundamental issue with all the assignments in these courses is that the course notebooks run under severe compute and storage limits, so the training sets are all unrealistically small. In order to get good "generalizable" models, your training data needs to reflect the statistical distribution of the "real" inputs your model needs to handle well.

As one example of this phenomenon, remember back in Course 1 Week 4, where we trained the "cat recognition" model with literally 209 training samples and 50 test samples. By "real world" standards, that is a laughably small dataset. Here's a thread which runs some experiments that perturb that dataset slightly and show that it is actually very carefully curated to work as well as it does.
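To get a feel for why 209/50 is so fragile, here's a minimal sketch (not the course's actual cat dataset or code, just a synthetic stand-in) that repeatedly draws a 209-sample training set and a 50-sample test set from the same synthetic distribution and measures test accuracy of a trivial classifier. The point is that with a test set this small, the measured accuracy swings noticeably depending on which samples happen to land in each split:

```python
import numpy as np

# Hypothetical illustration: two overlapping Gaussian classes in 5 dimensions,
# standing in for a small binary image-classification dataset.
rng = np.random.default_rng(0)
n_per_class = 260
X = np.vstack([
    rng.normal(0.0, 1.0, size=(n_per_class, 5)),
    rng.normal(0.8, 1.0, size=(n_per_class, 5)),
])
y = np.array([0] * n_per_class + [1] * n_per_class)

def nearest_centroid_accuracy(X_tr, y_tr, X_te, y_te):
    # Classify each test point by whichever class centroid is closer.
    c0 = X_tr[y_tr == 0].mean(axis=0)
    c1 = X_tr[y_tr == 1].mean(axis=0)
    pred = (np.linalg.norm(X_te - c1, axis=1)
            < np.linalg.norm(X_te - c0, axis=1)).astype(int)
    return (pred == y_te).mean()

# Repeatedly draw a 209-train / 50-test split and record test accuracy.
accuracies = []
for _ in range(200):
    idx = rng.permutation(len(y))
    tr, te = idx[:209], idx[209:259]
    accuracies.append(nearest_centroid_accuracy(X[tr], y[tr], X[te], y[te]))

accuracies = np.array(accuracies)
print(f"test accuracy over 200 splits: mean {accuracies.mean():.3f}, "
      f"min {accuracies.min():.3f}, max {accuracies.max():.3f}")
```

With only 50 test samples, each single misclassified image moves the accuracy by a full 2%, so the spread between the best and worst splits is substantial even though the underlying data distribution never changes. That is the sense in which a carefully curated tiny dataset can look much better (or worse) than the model really is.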