I’m writing about the quiz in week 2 of course 1 of the MLOps specialization.
The first question says that 98% of patients don’t have the disease. The second question then states that in the above example we have 98% positive examples (so, taking these two sentences together, “positive” is assigned to healthy patients). Then we read that if we assign everyone to class 1, we classify them all as ill.
I think these three parts are contradictory.
Given the usual notation, I’m assuming that the second statement is wrong and should read “98% negative examples”, since disease is usually assigned to class 1 (positive).
I’ll share what I understood from reading the first two questions.
For the first question, the key point is that the dataset is heavily skewed, so we need to combine precision and recall into a single metric: the F1 score.
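To illustrate why accuracy alone is misleading on skewed data, here is a minimal sketch in plain Python. The 100-patient toy dataset, the always-healthy classifier and the helper name `f1_report` are assumptions made up for illustration, not part of the quiz; treating a 0/0 ratio as 0.0 is also just a convention I picked.

```python
# Hypothetical toy data: 100 patients, 2 ill (label 1), 98 healthy (label 0).
y_true = [1] * 2 + [0] * 98
# A useless classifier that always predicts "healthy" (class 0).
y_pred = [0] * 100


def f1_report(y_true, y_pred):
    """Return (accuracy, precision, recall, f1) for binary labels 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    # Assumed convention: an undefined 0/0 ratio is reported as 0.0.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return accuracy, precision, recall, f1


accuracy, precision, recall, f1 = f1_report(y_true, y_pred)
print(f"accuracy={accuracy:.2f} precision={precision:.2f} "
      f"recall={recall:.2f} f1={f1:.2f}")
```

Accuracy looks great here even though the classifier never finds a single ill patient, while F1 drops to zero.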
For the second question, the problem is the same (detect whether the patient has a specific disease or not), but now we have 98% positive examples (98% of patients have the disease and 2% do not). The class proportions have been inverted in this case.
If the ML algorithm always returns ‘1’, then FN=0 and we can evaluate the recall as
recall = TP/(TP+FN)=1 or 100%.
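As a quick sanity check, the same arithmetic can be sketched with counts. The total of 100 examples is an assumption chosen only to make the percentages easy to read; the variable names are mine, not from the quiz.

```python
# Assumed toy counts: 100 examples, 98 positive (ill) and 2 negative (healthy).
n_pos, n_neg = 98, 2

# A classifier that always outputs 1 labels every example as positive:
tp = n_pos  # all real positives are predicted positive
fp = n_neg  # all real negatives are wrongly predicted positive
fn = 0      # nothing is ever predicted negative
tn = 0

recall = tp / (tp + fn)
precision = tp / (tp + fp)
f1 = 2 * precision * recall / (precision + recall)

print(f"recall={recall:.2f} precision={precision:.2f} f1={f1:.4f}")
```

With these counts recall is 1, precision is 0.98, and F1 comes out just under 0.99, so with 98% positives even the trivial always-1 classifier scores well on all three.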
Anyway, you are right: the text seems contradictory and could be improved.
Cheers
I was struggling with the same question. Suggestion for correction: “On the previous problem above, but with 98% positive examples…” ← this would clarify that the ratios are inverted.