Hyperparameter tuning: best-of

At the end of the course, Laurence Moroney recommends Andrew Ng's courses for learning more about hyperparameter tuning. He has taught many, though. Do you have a specific recommendation?

Considering where the last assignment of the course left off, I feel this is another essential skill to master. I am aware that there is a broad set of ways to potentially optimize a model, but I am missing a clear framework (or set of best practices) for structuring and prioritizing this task well.

Some example questions:

  • Should I first optimize the learning rate (lr), then the batch size, then the optimizer, and then …?
  • At what stage do I focus on optimizing the NN architecture rather than the hyperparameters?
  • How, and at what stage, can I best employ tools such as KerasTuner? Should I optimize one hyperparameter at a time, or several together (because of their interdependencies)?
  • How do I best balance my own judgment against hyperparameter search tools?

Clearly, not everything has a black-and-white, universally true answer. But I am sure there is a lot one could learn from others' experience and well-proven routines in this regard!


Please take up the Deep Learning Specialization.

Thank you so much for sharing your suggestion! I had the opportunity to take the specialization a few years ago and found it incredibly insightful. Recently, I revisited the specialization and explored some additional materials that have deepened my understanding even further.

After several weeks of experimentation, I discovered that GridSearch, Bayesian optimization using KerasTuner, and Optuna (with various samplers and pruners) are exceptionally beneficial. I hope this information proves to be helpful to my fellow TF/ML classmates as well! :blush:
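On the earlier question of tuning one hyperparameter at a time versus several together, here is a minimal sketch in plain Python of why a joint grid search can find something a sequential search misses. Everything in it is an assumption for illustration: `toy_val_loss` is a made-up stand-in for a validation loss in which the best learning rate depends on the batch size, and the candidate values are arbitrary. It does not use KerasTuner or Optuna and says nothing about how a real model behaves.

```python
import itertools
import math

# Hypothetical candidate values, chosen only for illustration.
LEARNING_RATES = [1e-4, 1e-3, 1e-2, 1e-1]
BATCH_SIZES = [16, 32, 64, 128]


def toy_val_loss(lr, batch_size):
    """Made-up stand-in for a validation loss (not a real model).

    The best learning rate shifts with the batch size, so the two
    hyperparameters interact.
    """
    best_log_lr = -3.0 + 0.5 * math.log2(batch_size / 32)
    lr_term = (math.log10(lr) - best_log_lr) ** 2
    batch_term = 0.1 * abs(math.log2(batch_size / 32))
    return lr_term + batch_term


def grid_search():
    """Joint search: evaluate every (lr, batch_size) combination."""
    candidates = itertools.product(LEARNING_RATES, BATCH_SIZES)
    return min(candidates, key=lambda c: toy_val_loss(*c))


def one_at_a_time(start_batch_size=128):
    """Sequential search: tune lr with the batch size fixed,
    then tune the batch size with that lr fixed."""
    lr = min(LEARNING_RATES, key=lambda l: toy_val_loss(l, start_batch_size))
    batch_size = min(BATCH_SIZES, key=lambda b: toy_val_loss(lr, b))
    return lr, batch_size


if __name__ == "__main__":
    joint = grid_search()
    sequential = one_at_a_time()
    print("joint:     ", joint, toy_val_loss(*joint))
    print("sequential:", sequential, toy_val_loss(*sequential))
```

On this toy surface the sequential search stays at its starting batch size, while the joint grid reaches a lower loss. The price is cost: the grid evaluates 16 combinations here instead of 8, and that gap grows quickly with each added hyperparameter, which is where the samplers and pruners in tools like Optuna come in.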

Wishing everyone continued success in their learning journey!