Q1) At 1:21, Andrew explains that the training error will increase as the model gains more experience. The reason is that it becomes harder to fit a model as more data points are added. This makes sense. However, I have trained models of my own and observed that the training error can decrease, because the model updates its weights based on the training data and performs better on it over time. This also makes sense. So I suppose my question is: what general trend should I expect for the training error?
Q2) At 4:34, we see that the validation error decreases in the case of underfitting. But in the underfitting case, the model performs badly on both the training and the validation data. So shouldn’t the validation error also increase, giving a shape similar to the training error curve?
Regarding your questions:
Q1) The general trend to expect depends on what is on the x-axis. If you plot error against the training-set size (as Andrew does in the Learning Curves video), the training error will generally increase as more data is added, because a fixed-capacity model can no longer fit every example closely. If you instead plot error against training iterations on a fixed dataset, the training error usually decreases as the weights are updated and the model improves.
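Here is a minimal sketch (my own setup, not from the lecture) contrasting the two trends on synthetic data, using scikit-learn’s `learning_curve` for the dataset-size view and a plain `partial_fit` loop for the iteration view:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, SGDRegressor
from sklearn.model_selection import learning_curve
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(500, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.2, size=500)

# Trend 1: training error vs. number of training examples.
# With more examples a fixed model can no longer fit every point,
# so the training error typically creeps up before leveling off.
sizes, train_scores, val_scores = learning_curve(
    LinearRegression(), X, y,
    train_sizes=np.linspace(0.1, 1.0, 5),
    scoring="neg_mean_squared_error", cv=5)
print("train MSE vs. dataset size:", (-train_scores.mean(axis=1)).round(3))

# Trend 2: training error vs. training iterations on a fixed dataset.
# Each pass updates the weights, so the error on that same data goes down.
sgd = SGDRegressor(learning_rate="constant", eta0=0.01, random_state=0)
for epoch in range(5):
    sgd.partial_fit(X, y)
    print(f"epoch {epoch}: train MSE = "
          f"{mean_squared_error(y, sgd.predict(X)):.3f}")
```

Both printouts describe the same model family; only the quantity being varied (data size vs. training time) differs.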
Q2) Andrew explains the learning curve for a model with high bias (underfitting). In this case, both the training error and the validation error are high. However, the validation error decreases initially as more data is added, before plateauing. This is because when the model is trained on very few examples, it generalizes very poorly and the validation error is large. As more data is added, the model’s generalization improves slightly, leading to a reduction in the validation error. Eventually, however, the validation error plateaus because the model is too simple to capture the underlying patterns.
Thus, while underfitting results in poor performance on both training and validation sets, adding more data can still slightly reduce the validation error early on, before it flattens out. The curves don’t mirror each other perfectly because the model’s ability to generalize may improve with more data, even if it remains underfit.
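To make the Q2 behaviour concrete, here is a small illustrative example (again my own assumed setup, not from the video): a straight-line model fit to clearly nonlinear data is underfitting, yet its validation error still drops as data is added before both curves flatten out at a high level.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import learning_curve

rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(600, 1))
y = X.ravel() ** 2 + rng.normal(scale=0.3, size=600)   # nonlinear target

sizes, train_scores, val_scores = learning_curve(
    LinearRegression(), X, y,
    train_sizes=np.linspace(0.05, 1.0, 8),
    scoring="neg_mean_squared_error", cv=5)

for n, tr, va in zip(sizes, -train_scores.mean(axis=1), -val_scores.mean(axis=1)):
    # Validation MSE falls at first, then both errors plateau high:
    # the signature of high bias.
    print(f"{n:4d} examples | train MSE {tr:6.2f} | val MSE {va:6.2f}")
```

The printed table shows the shape Andrew draws: training error rises toward a plateau, validation error falls toward the same high plateau, and the gap between them stays small because the model is too simple rather than overfit.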