There is a subtle but serious bug in the provided Flexible CNN architecture (HP tuning with Optuna): the classifier is created lazily inside the first forward pass, but the optimizer is instantiated before that pass ever runs.
Mistake
`optimizer = Adam(model.parameters())` is called before the classifier exists.
Reason
The model builds its classifier lazily inside `forward()`, presumably because the flattened feature size depends on the trial's conv configuration. Until then, `model.parameters()` yields only the convolutional parameters, and a PyTorch optimizer snapshots its parameter list at construction; parameters created afterwards are never picked up.
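For context, the pattern looks roughly like this (a sketch; `FlexibleCNN` and the layer sizes are illustrative, not the actual tuned architecture):

```python
import torch
import torch.nn as nn

class FlexibleCNN(nn.Module):
    """Minimal sketch of the lazy-classifier pattern."""
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.num_classes = num_classes
        self.classifier = None  # created lazily on the first forward pass

    def forward(self, x):
        x = self.features(x)
        x = torch.flatten(x, 1)
        if self.classifier is None:
            # The flattened size is only known here, so the head is built
            # inside forward() -- after the optimizer already exists.
            self.classifier = nn.Linear(x.shape[1], self.num_classes).to(x.device)
        return self.classifier(x)
```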
Impact
The optimizer never sees the classifier parameters, so the classifier remains essentially untrained: its weights never move from their random initialization.
Evidence
Debug output:
Before forward:
    Model params: 1520
    Optimizer params: 1520
    Classifier exists: False
After first forward:
    Model params: 658298
    Classifier params: 658298
    Optimizer params (unchanged): 1520
The optimizer was tracking only ~0.2% of the model's parameters (1520 of 658298).
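A check along these lines reproduces the numbers above (a sketch; it assumes the lazy head lives in a `model.classifier` attribute as in the earlier sketch):

```python
# Compare the parameters the optimizer actually tracks
# against what the model currently exposes.
tracked = sum(p.numel() for group in optimizer.param_groups
              for p in group["params"])
total = sum(p.numel() for p in model.parameters())
print(f"Optimizer params: {tracked} | Model params: {total}")
print(f"Classifier exists: {model.classifier is not None}")
```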
Fix
Force classifier creation with a dummy forward pass before building the optimizer:

import torch
from torch.optim import Adam

dummy = torch.randn(1, 3, 32, 32).to(device)  # dummy batch with the model's input shape
_ = model(dummy)                              # builds the classifier
optimizer = Adam(model.parameters(), lr=...)  # now sees every parameter
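In the Optuna loop, the dry run belongs inside the objective, after the trial's model is built and before the optimizer is. A sketch (`build_model`, `device`, and the input shape are assumptions about the surrounding code):

```python
import optuna
import torch
from torch.optim import Adam

def objective(trial: optuna.Trial) -> float:
    model = build_model(trial).to(device)  # hypothetical per-trial model factory
    # Dry run: materialize the lazily built classifier BEFORE the
    # optimizer snapshots model.parameters().
    with torch.no_grad():
        model(torch.randn(1, 3, 32, 32, device=device))
    lr = trial.suggest_float("lr", 1e-4, 1e-1, log=True)
    optimizer = Adam(model.parameters(), lr=lr)
    ...  # train, evaluate, and return validation accuracy
```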
Performance Impact (Best Trial Result)
- Before fix: 0.5585 accuracy
- After fix: 0.6405 accuracy
A clear jump of about 8 percentage points, showing that the classifier is finally being trained.
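Worth noting: this pitfall is not specific to hand-rolled lazy heads. PyTorch's built-in lazy modules (e.g. `nn.LazyLinear`) defer parameter creation the same way, and the documented recommendation is the same remedy, a dry run before constructing the optimizer. A minimal illustration:

```python
import torch
import torch.nn as nn

head = nn.LazyLinear(out_features=10)  # in_features inferred on first call
_ = head(torch.randn(2, 512))          # dry run materializes the weight
# Only now is it safe to hand head.parameters() to an optimizer.
optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)
```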