As I understand it, logistic regression is usually applied with a threshold of 0.5 (meaning that if the predicted probability is greater than 0.5, the output is taken to be 1). However, in an example where we try to identify positive/negative patients, we probably want to avoid false negatives more than false positives. Hence we might choose a threshold other than 0.5, for instance lowering it to 0.1 so that fewer positive patients are missed. Is there any way to control the risk of having false negatives? I am thinking of statistical tests where we can tune the test with respect to specific risk levels (alpha and beta risks, if I remember correctly).
I haven’t seen this question on the forum so far; please redirect me if needed.
Thanks a lot,
Welcome to our community!
The threshold is a hyperparameter, like the learning rate, so there is no self-tuning procedure that guarantees a given rate of false positives or false negatives. You can measure the performance on each class with a confusion matrix, which shows you which class the model fails on, and then try adjusting the threshold accordingly. Personally, I haven’t come across statistical tests that fit and tune the risk levels. But there are other models that can be useful when one class has very few examples, such as an anomaly detection model, or a random forest, which is a rule-based approach that predicts based on whether certain conditions hold.
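To make the confusion-matrix idea concrete, here is a minimal sketch in plain Python. It counts the four confusion-matrix cells at a given threshold, then picks the highest threshold that still reaches a target recall on a validation set. The helper names `confusion_counts` and `threshold_for_recall` and the toy labels and probabilities are made up for illustration. Note that this only bounds the false negative rate on the data you tune on; it is not a guarantee for new patients.

```python
def confusion_counts(y_true, probs, threshold):
    """Return (tp, fp, fn, tn), predicting 1 when prob >= threshold."""
    tp = fp = fn = tn = 0
    for y, p in zip(y_true, probs):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def threshold_for_recall(y_true, probs, min_recall):
    """Highest observed score usable as a threshold with recall >= min_recall."""
    if 1 not in y_true:
        raise ValueError("recall is undefined without positive examples")
    # Try the observed scores from highest to lowest; recall can only
    # stay the same or grow as the threshold drops.
    for t in sorted(set(probs), reverse=True):
        tp, fp, fn, tn = confusion_counts(y_true, probs, t)
        if tp / (tp + fn) >= min_recall:
            return t
    raise ValueError("min_recall must be at most 1.0")


# Toy validation set (invented): 1 = positive patient.
y_val = [0, 0, 1, 0, 1, 1, 0, 1]
p_val = [0.05, 0.20, 0.30, 0.40, 0.55, 0.70, 0.80, 0.95]

print(confusion_counts(y_val, p_val, 0.5))    # (3, 1, 1, 3): one false negative
best = threshold_for_recall(y_val, p_val, min_recall=1.0)
print(best)                                   # 0.3
print(confusion_counts(y_val, p_val, best))   # (4, 2, 0, 2): no false negatives
```

On this toy data, lowering the threshold from 0.5 to 0.3 removes the single false negative at the cost of one extra false positive, which is the trade-off you have to weigh.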
Personally I find that tuning the threshold is problematic. Any changes in the results from shifting the threshold could just as easily be achieved by simply learning better weight values.
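To spell out the link between the threshold and the weights: for logistic regression, comparing sigmoid(z) with a threshold t gives the same decision as comparing sigmoid(z - logit(t)) with 0.5, so shifting the threshold amounts to changing the bias term. Here is a small check of that identity. The helper names and the scores are arbitrary, chosen only for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def logit(p):
    # Inverse of the sigmoid, defined for 0 < p < 1.
    return math.log(p / (1.0 - p))


def predict_with_threshold(z, threshold):
    """Keep the model as is and move the decision threshold."""
    return 1 if sigmoid(z) >= threshold else 0


def predict_with_shifted_bias(z, threshold):
    """Keep the 0.5 threshold and subtract logit(threshold) from the bias."""
    return 1 if sigmoid(z - logit(threshold)) >= 0.5 else 0


def decisions_agree(scores, threshold):
    a = [predict_with_threshold(z, threshold) for z in scores]
    b = [predict_with_shifted_bias(z, threshold) for z in scores]
    return a == b


# Arbitrary pre-sigmoid scores z = w.x + b (invented for the example).
scores = [-3.0, -1.0, -0.5, 0.0, 0.5, 1.0, 3.0]
for t in (0.1, 0.3, 0.5, 0.9):
    print(t, decisions_agree(scores, t))
```

This only shows that the two are interchangeable at prediction time. It says nothing about whether training would arrive at that bias by itself.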
As @AbdElRhaman_Fakhry mentions, it may be more useful to use a different metric rather than just accuracy. The F1 score is often used.
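As a rough illustration of why accuracy alone can mislead when positives are rare, here is a sketch that computes accuracy, precision, recall and F1 from confusion-matrix counts. The function names and the two sets of counts are invented for the example (100 patients, 5 of them positive).

```python
def accuracy(tp, fp, fn, tn):
    return (tp + tn) / (tp + fp + fn + tn)


def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, f1), using 0.0 where a ratio is undefined."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Model A: always predicts "negative" and so misses all 5 positives.
acc_a = accuracy(tp=0, fp=0, fn=5, tn=95)
prf_a = precision_recall_f1(tp=0, fp=0, fn=5)

# Model B: finds 4 of the 5 positives but raises 10 false alarms.
acc_b = accuracy(tp=4, fp=10, fn=1, tn=85)
prf_b = precision_recall_f1(tp=4, fp=10, fn=1)

print(acc_a, prf_a)
print(acc_b, prf_b)
```

Model A has the higher accuracy (0.95 against 0.89) but an F1 of 0, while model B reaches a recall of 0.8 and an F1 of about 0.42. If false negatives matter more than false positives, recall itself is also worth tracking alongside F1.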