Do hyperparameters have some sort of multicollinearity/dependencies between them?
Because if, for the sake of argument, we assume that they are totally independent of each other, then we wouldn't need ALL combinations in Grid Search, right? We could just do it as shown in the lab, tuning one hyperparameter at a time using training and validation accuracy plots. I'm looking for some intuition around this. Thank you
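To make the question concrete, here is a minimal sketch of the two approaches, assuming a scikit-learn `DecisionTreeClassifier` and a synthetic dataset; the parameter ranges are illustrative, not the lab's. The one-at-a-time procedure only lands on the same optimum as the joint grid search if the two hyperparameters do not interact.

```python
# Sketch: joint grid search vs. one-at-a-time tuning (illustrative values).
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=100, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

# Joint search: every (max_depth, min_samples_split) combination is evaluated.
param_grid = {"max_depth": [2, 4, 8, 16], "min_samples_split": [2, 10, 30, 60]}
joint = GridSearchCV(DecisionTreeClassifier(random_state=0), param_grid, cv=5)
joint.fit(X_train, y_train)
print("joint best:", joint.best_params_)

# One-at-a-time: tune max_depth first, freeze it, then tune min_samples_split.
depth_search = GridSearchCV(DecisionTreeClassifier(random_state=0),
                            {"max_depth": param_grid["max_depth"]}, cv=5)
depth_search.fit(X_train, y_train)
d = depth_search.best_params_["max_depth"]
split_search = GridSearchCV(DecisionTreeClassifier(max_depth=d, random_state=0),
                            {"min_samples_split": param_grid["min_samples_split"]},
                            cv=5)
split_search.fit(X_train, y_train)
print("one-at-a-time best:", {"max_depth": d, **split_search.best_params_})
```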
The lab explored these two hyperparameters: max_depth and min_samples_split. Speaking of interdependence, do you think some value of min_samples_split could make certain max_depth values meaningless? For the sake of discussion, let's say our training set has 100 samples.
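Here is a small sketch of that 100-sample thought experiment (the dataset and values are my own illustrative assumptions, not the lab's): with a large min_samples_split, the tree stops splitting long before a large max_depth is reached, so the deeper settings become meaningless.

```python
# Sketch: a large min_samples_split caps the depth the tree can actually reach.
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=100, random_state=0)

for max_depth in [2, 4, 8, 16]:
    tree = DecisionTreeClassifier(max_depth=max_depth,
                                  min_samples_split=60,
                                  random_state=0).fit(X, y)
    print(f"max_depth={max_depth:2d} -> actual depth {tree.get_depth()}")

# With min_samples_split=60, only nodes holding at least 60 of the 100 samples
# may split, so the printed depth plateaus; larger max_depth values are redundant.
```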
Of course they are interdependent. The entire model is connected through all of its parts, and its parameters depend on one another, whether heavily or lightly. I don't know the exact dependence relation in your particular case, but I am sure there is one!
Thank you @rmwkwok and @gent.spah for your quick replies. When asking the question, I was primarily thinking of features being independent of one another so that the model can be robust, and I applied the same thought process to hyperparameters.
Yes, I now see that max_depth and min_samples_split are not totally independent; they would tend to be negatively correlated. Again, thank you for your help.
You are welcome, @TanaySontakke! In fact, I believe that it is a necessary step for every learner to think about that interdependence when they learn about decision-tree-based models.
You have asked a great question indeed, and you have provided a good answer for the pair of hyperparameters we covered here.