C2W1 How can accuracy of the validation set improve while the loss is constant

In the workbook of C2W1 it surprises me to see that the accuracy of the validation set plateaus with the number of epochs while the loss keeps increasing. Aren't these quantities related? Shouldn't accuracy decrease if loss increases?

P.S. I am fully aware that the increasing loss is a sign of overfitting. I am just surprised that accuracy isn't decreasing.

Which workbook are you referring to?

Accuracy tells you how well the model is classifying the data points. Loss indicates how confident the model is when predicting the class of each point.
Consider one batch of 3 points, where each has the following predicted probability for its correct class:

>>> import numpy as np
>>> # mean negative log-likelihood of the correct class
>>> loss = lambda preds: -np.mean(np.log(preds))
>>> y_pred_weak_confidence = np.array([.51, .52, .56])
>>> loss(y_pred_weak_confidence)  # all 3 correct, but high loss (~0.64)
>>> y_pred_better_confidence = np.array([.99, .95, 1])
>>> loss(y_pred_better_confidence)  # same accuracy (100%), much lower loss (~0.02)
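The same idea explains the plots in the question: predictions can stay on the correct side of the 0.5 decision threshold (so accuracy is flat) while drifting toward it (so loss rises). A minimal sketch; the probabilities below are made up for illustration, not taken from the notebook:

```python
import numpy as np

def loss(preds):
    # mean negative log-likelihood of the correct class
    return -np.mean(np.log(preds))

def accuracy(preds):
    # a prediction counts as correct if the probability assigned
    # to the correct class exceeds the 0.5 decision threshold
    return np.mean(preds > 0.5)

# hypothetical predicted probabilities for the correct class of
# 3 validation points, at an early and a late epoch: still all
# correct, but the model has become less confident on them
early_epoch = np.array([0.90, 0.80, 0.70])
late_epoch = np.array([0.60, 0.55, 0.51])

assert accuracy(early_epoch) == accuracy(late_epoch) == 1.0  # accuracy unchanged
assert loss(late_epoch) > loss(early_epoch)                  # loss increased
```

So validation loss can climb for many epochs before any point actually crosses the threshold and accuracy finally drops.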

Point taken, thank you

You’re welcome.
2 things:

  1. The graph doesn’t have a legend explaining what the orange and blue lines refer to.
  2. To repeat, which notebook are you referring to?

2. It is the only workbook in C2W1, i.e. “C2/W1/ungraded_lab/C2_W1_Lab_1_cats_vs_dogs.ipynb”

1. The plots are generated in the workbook. The x-axis is the number of epochs; the y-axis is accuracy in the first plot and loss in the second. The blue line refers to the training data, the yellow one to the test/validation data.
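For reference, such plots typically follow the standard Keras pattern sketched below. The `history` object and its `history.history` keys (`'accuracy'`, `'val_accuracy'`, `'loss'`, `'val_loss'`) are the usual ones returned by `model.fit`; the exact variable names in the notebook may differ, and the `plt.legend` call is the fix for the missing-legend issue mentioned above:

```python
import matplotlib.pyplot as plt

def plot_metrics(history):
    # history.history is a dict of per-epoch metric lists
    acc = history.history['accuracy']
    val_acc = history.history['val_accuracy']
    loss = history.history['loss']
    val_loss = history.history['val_loss']
    epochs = range(len(acc))

    # first plot: accuracy (blue = training, orange = validation by default)
    plt.plot(epochs, acc)
    plt.plot(epochs, val_acc)
    plt.title('Training and validation accuracy')
    plt.legend(['Training', 'Validation'])  # adds the missing legend
    plt.show()

    # second plot: loss
    plt.plot(epochs, loss)
    plt.plot(epochs, val_loss)
    plt.title('Training and validation loss')
    plt.legend(['Training', 'Validation'])
    plt.show()
```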