DOE for Hyperparameter tuning

Hello,
Concerning hyperparameter tuning, we saw that a grid search is not a good way to do it, mainly because of the curse of dimensionality and the poor properties of its marginal distributions. For both reasons, random sampling is advocated instead.
I would therefore like to know why a space-filling design of experiments is not used. For instance, a space-filling LHS (Latin Hypercube Sampling) would have better marginal distributions than random sampling and would be better distributed in the parameter space.
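Just to make the idea concrete, here is a rough sketch of how LHS candidates could replace random-search candidates. The SVC model, the two hyperparameters and their log-ranges below are purely illustrative choices, and I use scipy.stats.qmc for the sampling:

```python
# Rough sketch: draw hyperparameter candidates with a Latin Hypercube sample
# instead of plain random sampling, then evaluate them with cross-validation.
# The model (SVC), the hyperparameters (C, gamma) and their ranges are illustrative.
import numpy as np
from scipy.stats import qmc
from sklearn.datasets import load_digits
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)

n_candidates = 20
sampler = qmc.LatinHypercube(d=2, seed=0)      # one dimension per hyperparameter
unit_sample = sampler.random(n=n_candidates)   # points in the unit square [0, 1)^2

# Map the unit hypercube to log10(C) in [-3, 3] and log10(gamma) in [-6, 0].
log_params = qmc.scale(unit_sample, [-3, -6], [3, 0])
candidates = 10.0 ** log_params

best_score, best_params = -np.inf, None
for C, gamma in candidates:
    score = cross_val_score(SVC(C=C, gamma=gamma), X, y, cv=3).mean()
    if score > best_score:
        best_score, best_params = score, {"C": C, "gamma": gamma}

print(best_params, best_score)
```

Each candidate is evaluated exactly as a random-search candidate would be; only the way the candidates are drawn changes.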
Thanks for your help,
Frédéric

3 Likes

Hi @Fredo,

That’s a really interesting topic you bring up.

I’m going to guess and say it’s simply out of scope for this course :sweat_smile: There are many approaches to hyperparameter optimization that are not discussed, including experimental design.

Space-filling designs do indeed seem to be more effective than random or grid search, but I don’t have any experience with them. If you do and would like to share it with us, that would be amazing :slight_smile:

1 Like

Hi,
You can find a 2D example with different types of design here (in scikit-optimize):
https://scikit-optimize.github.io/stable/auto_examples/sampler/initial-sampling-method.html
Then have a look at the differences between basic random sampling and a maximin LHS.
I think there is no doubt about the benefit, especially when the number of points sampled in the hyperparameter space is small compared to its dimension (which is generally the case in deep learning) :slight_smile:
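Here is a rough sketch of the kind of comparison shown on that page, using the same 2D space as the example (the minimum pairwise distance is just one simple, hand-picked way to quantify how spread out the points are):

```python
# Compare plain random sampling with a maximin LHS in the 2D space
# used in the linked scikit-optimize example.
import numpy as np
from scipy.spatial.distance import pdist
from skopt.sampler import Lhs
from skopt.space import Space

space = Space([(-5.0, 10.0), (0.0, 15.0)])
n_samples = 10

# Plain random sampling.
rng = np.random.RandomState(0)
random_points = np.array(space.rvs(n_samples, random_state=rng))

# Maximin Latin Hypercube sampling.
lhs = Lhs(criterion="maximin", iterations=10000)
lhs_points = np.array(lhs.generate(space.dimensions, n_samples, random_state=0))

# Smallest distance between any two points in each design.
print("min pairwise distance (random):      ", pdist(random_points).min())
print("min pairwise distance (maximin LHS): ", pdist(lhs_points).min())
```

The maximin criterion explicitly tries to maximise the minimal distance between points, so it should give the larger of the two values, i.e. a better-spread design.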

5 Likes

Thank you for the contribution, @Fredo. I think the concept of spreading out the points more systematically is worth mentioning. I’m sure many students will find this very interesting.