C1W2 Ungraded Lab: misclassification calculation

In C1W2_Ungraded_Lab_Birds_Cats_Dogs.ipynb, in the cell right below the Confusion Matrix, I don’t think the misclassifications are computed the right way.

Let’s take the misclassification rate of Birds as an example. In my view, it is the rate we get when we submit actual Birds to the model (one by one) and count the wrong predictions (i.e., real Birds predicted as cats or dogs). The ratio of the number of wrong predictions (numerator) to the total number of Birds submitted for classification (denominator) should be the misclassification rate of Birds, IMHO.

In Python: ((y_true == 0) & ((y_pred_imbalanced == 2) | (y_pred_imbalanced == 1))).sum() / (y_true == 0).sum()
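
To make this concrete, here is a minimal self-contained sketch; the toy arrays are made up for illustration, and the label encoding 0 = bird, 1 = cat, 2 = dog is assumed from the lab’s code:

import numpy as np

# Made-up toy data (not from the lab); assumed encoding: 0 = bird, 1 = cat, 2 = dog
y_true = np.array([0, 0, 0, 0, 1, 1, 2, 2, 2, 2])
y_pred_imbalanced = np.array([0, 1, 2, 0, 0, 0, 2, 2, 2, 0])

# Wrong predictions among the actual Birds, divided by the number of actual Birds
birds_misclassification_rate = (
    ((y_true == 0) & ((y_pred_imbalanced == 1) | (y_pred_imbalanced == 2))).sum()
    / (y_true == 0).sum()
)
print(birds_misclassification_rate)  # 2 of the 4 actual birds are wrong -> 0.5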

The formula in the notebook instead considers all of the Birds predictions (every sample the model classified as a bird) and counts how many times the ground truth was a cat or a dog rather than a bird.

In Python: ((y_pred_imbalanced == 0) & ((y_true == 2) | (y_true == 1))).sum() / (y_pred_imbalanced == 0).sum()

Discussion: in my view, the per-class misclassification rate is the False Negative Rate, while the formula in the notebook calculates the False Discovery Rate.
Regarding the terminology, see the Wikipedia article on the confusion matrix.
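
To show that the two rates really give different numbers, here is a short sketch on the same toy data as above (again made up), assuming the confusion matrix is built like sklearn’s confusion_matrix, i.e., rows = true labels and columns = predictions:

import numpy as np
from sklearn.metrics import confusion_matrix

# Same made-up toy data as above; assumed encoding: 0 = bird, 1 = cat, 2 = dog
y_true = np.array([0, 0, 0, 0, 1, 1, 2, 2, 2, 2])
y_pred = np.array([0, 1, 2, 0, 0, 0, 2, 2, 2, 0])

cm = confusion_matrix(y_true, y_pred)  # rows = true labels, columns = predictions
bird = 0

# False Negative Rate: actual birds not predicted as birds / all actual birds
fnr_birds = (cm[bird, :].sum() - cm[bird, bird]) / cm[bird, :].sum()
# False Discovery Rate: bird predictions that are not birds / all bird predictions
fdr_birds = (cm[:, bird].sum() - cm[bird, bird]) / cm[:, bird].sum()

print(f"Birds FNR (my reading of misclassification rate): {fnr_birds:.2f}")  # 0.50
print(f"Birds FDR (what the notebook computes):           {fdr_birds:.2f}")  # 0.60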

Your view, guys?

Thanks for pointing this out. I’ve asked the staff to fix it.

Please use this snippet:

# Rows of imbalanced_cm are the true labels and columns are the predictions,
# so a row's off-diagonal entries are that class's misclassifications.
misclassified_birds = (imbalanced_cm[0, 1] + imbalanced_cm[0, 2]) / np.sum(imbalanced_cm, axis=1)[0]
misclassified_cats = (imbalanced_cm[1, 0] + imbalanced_cm[1, 2]) / np.sum(imbalanced_cm, axis=1)[1]
misclassified_dogs = (imbalanced_cm[2, 0] + imbalanced_cm[2, 1]) / np.sum(imbalanced_cm, axis=1)[2]

print(f"Proportion of misclassified birds: {misclassified_birds*100:.2f}%")
print(f"Proportion of misclassified cats: {misclassified_cats*100:.2f}%")
print(f"Proportion of misclassified dogs: {misclassified_dogs*100:.2f}%")

The notebook has been updated. Thanks to @zbynekb for flagging this and to @balaji.ambresh for coming up with the solution :slight_smile: