[final assignment] Wouldn't training data accuracy always be 100%?

In one of the cells in the last exercise, the notebook takes a look at accuracy across training data. 2 questions:

  1. Wouldn’t this always be 100%, since the model learned off of this data? In other words, if the model’s already seen this data, wouldn’t it just…memorize it?

  2. The result in the cell is 99.99% - in what cases would it be lower than ~100%? Is it ever possible to see a DL model accuracy of like, 50% when looking at accuracy of predicting training data?

Not necessarily. You can drive the training accuracy very close to 100% through excessive exposure to the training data set, but that turns out to be a bad thing: generally, when training accuracy trends that high, validation accuracy trends worse. The term you'll see for this phenomenon is overfitting. Here is a graph showing the relationship between training error, validation error, and the number of training iterations. Note: this plot uses error instead of accuracy, so lower is better and the curves are flipped relative to what you'd see for accuracy.

[Image: training error and validation error vs. number of training iterations]

What you really care about is the validation accuracy, not the training accuracy, since that is an indication of how well the trained model generalizes to previously unseen inputs.
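For concreteness, here's a minimal sketch of measuring the two numbers separately (this is not the notebook's code; the toy data and tiny Keras model are placeholders): hold some data out as a validation set and report accuracy on both splits.

```python
# Minimal sketch, not the notebook's code: toy data and a tiny Keras model,
# just to show training vs. validation accuracy measured separately.
import numpy as np
from tensorflow import keras

# synthetic binary-classification data (placeholder for the real dataset)
rng = np.random.default_rng(0)
X = rng.random((1000, 20)).astype("float32")
y = (X[:, 0] + X[:, 1] > 1.0).astype("float32")

model = keras.Sequential([
    keras.Input(shape=(20,)),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# hold out 20% of the data as a validation set
history = model.fit(X, y, epochs=50, validation_split=0.2, verbose=0)

# training accuracy can keep climbing toward 100% with more epochs;
# validation accuracy is the number that tells you about generalization
print("final train acc:", history.history["accuracy"][-1])
print("final val acc:  ", history.history["val_accuracy"][-1])
```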

It is possible to see low training accuracy if the model isn't learning at all. This could be due to a poor model architecture, a poor choice of cost function, or a bug in its implementation.

Ideally, you see a downward trend in both training and validation error while the model is learning, and you should stop training once the validation error stops improving and starts to rise, rather than waiting for the training error to bottom out.
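A common way to implement that stopping rule is early stopping. As a rough sketch (continuing the toy `model`, `X`, `y` from the snippet above, and using Keras's `EarlyStopping` callback), you monitor validation loss and stop once it hasn't improved for a few epochs:

```python
# Sketch of early stopping, reusing the `model`, `X`, `y` from the sketch above.
# Stop once validation loss hasn't improved for `patience` epochs, and roll the
# weights back to the best epoch seen so far.
early_stop = keras.callbacks.EarlyStopping(
    monitor="val_loss",          # watch the validation metric, not training
    patience=5,                  # tolerate a few epochs without improvement
    restore_best_weights=True,   # keep the weights from the best epoch
)

model.fit(X, y, epochs=200, validation_split=0.2,
          callbacks=[early_stop], verbose=0)
```

In practice this is what keeps you from drifting into the right-hand, overfitting region of the curve shown above.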