Help with class imbalance in mammogram dataset when fine-tuning EfficientNet?

1blak_Boy · May 26, 2026, 1:32pm

I’m working on a binary classification problem for breast cancer detection using mammogram images. I’m fine-tuning a pretrained EfficientNet model.

Dataset distribution:

Train set:

Benign: 1569
Malignant: 803

Validation set:

Benign: 448
Malignant: 66

Test set:

Benign: 208
Malignant: 128

The dataset is moderately imbalanced overall (~2.2:1 benign to malignant), but the validation set is heavily skewed toward the majority class (~6.8:1).
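For reference, a quick sketch that recomputes the benign-to-malignant ratio per split from the counts listed above. The dictionary layout and function name are only illustrative.

```python
# Class counts copied from the post: (benign, malignant) per split.
counts = {
    "train": (1569, 803),
    "validation": (448, 66),
    "test": (208, 128),
}


def benign_to_malignant_ratio(benign, malignant):
    """Return how many benign images there are per malignant image."""
    return benign / malignant


# Ratio for each split, plus the ratio over all three splits combined.
ratios = {
    split: benign_to_malignant_ratio(benign, malignant)
    for split, (benign, malignant) in counts.items()
}
total_benign = sum(benign for benign, _ in counts.values())
total_malignant = sum(malignant for _, malignant in counts.values())
overall_ratio = benign_to_malignant_ratio(total_benign, total_malignant)

for split, ratio in ratios.items():
    print(f"{split}: {ratio:.2f}:1")
print(f"overall: {overall_ratio:.2f}:1")
```

The three splits have noticeably different class ratios, so metrics that depend on prevalence (precision, F1) are not directly comparable between validation and test.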

What I’ve tried / considered

Data augmentation for minority class

My main concerns:

What is the most reliable approach for handling this type of imbalance in medical image classification?

Should the validation set be rebalanced, or should it reflect real-world distribution?
Which metrics should I prioritize for this problem (recall, AUC, F1, etc.)?

My main goal is to maximize malignant class detection (recall), since false negatives are critical in this case.

What I’m looking for:

Best practices or recommended training strategy for handling this kind of imbalance when fine-tuning EfficientNet on medical imaging data.

gent.spah · May 27, 2026, 10:03am

You’re thinking about the problem in the right way.

Data augmentation is generally good practice for image data, and it can help with imbalance by adding variety to the minority class.

For medical classification problems like this, the key is not to rely on accuracy. You should focus on precision, recall, and F1-score, especially recall for the malignant class since false negatives are critical.
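To illustrate why accuracy alone can mislead here, the sketch below computes accuracy, precision, recall, and F1 from scratch for a hypothetical model that labels every validation image as benign. The function name `binary_metrics` and the label encoding (1 = malignant, 0 = benign) are assumptions made for this example.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 with label 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # malignant, caught
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # benign, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # malignant, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # benign, cleared

    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    accuracy = (tp + tn) / len(pairs)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical degenerate model: predicts benign (0) for every validation image.
# The class counts (448 benign, 66 malignant) come from the original post.
y_true = [0] * 448 + [1] * 66
y_pred = [0] * len(y_true)

metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

This model reaches about 87% accuracy on the validation split while its malignant recall is zero, which is exactly the failure that accuracy hides.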

This topic is also covered in the Machine Learning Specialization—there’s a unit on evaluation metrics for imbalanced datasets that’s worth revisiting.

Overall: prioritize the right metrics, not just raw accuracy.

balaji.ambresh · May 27, 2026, 10:38am

Have you seen this?

ai_curious · May 27, 2026, 10:50am

This excerpt in the weighted initialization section of the tutorial linked above seems particularly relevant to the OP’s objective…

The default threshold of t = 50% corresponds to equal costs of false negatives and false positives. In the case of fraud detection, however, you would likely associate higher costs to false negatives than to false positives.
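To make the threshold point concrete for the malignant-recall goal, here is a minimal sketch that picks the decision threshold from validation scores instead of using the default 0.5. The helper names and the toy scores are made up for illustration. Real scores would come from the fine-tuned model's predicted malignant probabilities.

```python
import math


def threshold_for_recall(y_true, scores, target_recall):
    """Return the largest threshold t such that predicting positive when
    score >= t catches at least target_recall of the true positives."""
    positive_scores = sorted(
        (s for t, s in zip(y_true, scores) if t == 1), reverse=True
    )
    if not positive_scores:
        raise ValueError("no positive examples to calibrate on")
    # Number of positives that must score at or above the threshold.
    needed = max(1, math.ceil(target_recall * len(positive_scores)))
    return positive_scores[needed - 1]


def recall_at(y_true, scores, threshold):
    """Recall of the positive class when predicting positive for score >= threshold."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    return sum(1 for s in positives if s >= threshold) / len(positives)


# Toy validation data (hypothetical): 1 = malignant, 0 = benign.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.10, 0.20, 0.35, 0.60, 0.30, 0.45, 0.70, 0.90]

default_recall = recall_at(y_true, scores, 0.5)
tuned_threshold = threshold_for_recall(y_true, scores, target_recall=0.75)
tuned_recall = recall_at(y_true, scores, tuned_threshold)

print(f"recall at 0.5: {default_recall}")
print(f"threshold for 75% recall: {tuned_threshold}, recall: {tuned_recall}")
```

Lowering the threshold trades more false positives for fewer missed malignant cases, so the threshold is best chosen on validation data and then left fixed when reporting test results.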

Related to this is the idea of applying class weights directly in the loss function. It seems to have been explored in several medical imaging studies that you can find online. For example
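As a rough sketch of the loss-weighting idea itself (not taken from any particular study), the snippet below derives "balanced" class weights from the training counts in the original post and applies them in a binary cross-entropy. The weight formula total / (2 * class_count) is one common heuristic, and the function names are assumptions for this example. Most deep learning frameworks offer an equivalent option, so a hand-written loss like this is mainly useful for seeing what the weights do.

```python
import math


def class_weights(n_negative, n_positive):
    """Balanced weights: total / (2 * class_count), so that each class
    contributes the same total weight to the loss."""
    total = n_negative + n_positive
    return total / (2 * n_negative), total / (2 * n_positive)


def weighted_bce(y_true, probs, w_negative, w_positive, eps=1e-7):
    """Mean binary cross-entropy with a per-class weight on each example.

    probs are predicted probabilities of the positive (malignant) class.
    """
    losses = []
    for t, p in zip(y_true, probs):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        if t == 1:
            losses.append(-w_positive * math.log(p))
        else:
            losses.append(-w_negative * math.log(1 - p))
    return sum(losses) / len(losses)


# Training counts from the original post: 1569 benign, 803 malignant.
w_benign, w_malignant = class_weights(1569, 803)
print(f"benign weight: {w_benign:.3f}, malignant weight: {w_malignant:.3f}")

# An equally confident mistake costs more on a malignant case than on a benign one.
missed_malignant = weighted_bce([1], [0.2], w_benign, w_malignant)
false_alarm = weighted_bce([0], [0.8], w_benign, w_malignant)
print(f"missed malignant: {missed_malignant:.3f}, false alarm: {false_alarm:.3f}")
```

With these counts the malignant weight is roughly twice the benign weight, mirroring the roughly 2:1 imbalance in the training split.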
