Hyperparameter Tuning

Lina_Hourieh · July 31, 2022, 11:47am

Hello,
How can we know how important a certain hyperparameter is compared to the others?
Prof. Andrew stated that tuning the learning rate (alpha) is more important than tuning the number of hidden layers. How did he figure that out?

Elemento · July 31, 2022, 12:53pm

Hey @Lina_Hourieh,
Well, that’s a nice question. I guess one way to look at this is to find out how much each hyper-parameter can influence your model’s outputs, and to assign relative importance to the hyper-parameters accordingly. For instance, hyper-parameters like the number of neurons in each layer, the number of layers, the activation function in each layer, etc. often affect the model’s outputs to a relatively small extent compared to the learning rate, which may affect them to a larger extent, and hence we focus more on tuning the learning rate. This relative importance is something you find out only after performing lots and lots of experiments.
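To make "find out how much each hyper-parameter influences the outputs" a bit more concrete, here is a minimal sketch of a one-at-a-time sweep. Everything in it is an assumption for illustration: the toy XOR-like dataset, the tiny one-hidden-layer network, the helper names (`make_data`, `train_and_score`), the candidate values, and the use of the number of hidden units as a stand-in for model size. It varies one hyper-parameter while holding the other fixed and reports how far the validation loss moves.

```python
import numpy as np

def make_data(n=200, seed=0):
    """Toy XOR-like binary classification data (illustrative assumption)."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, 2))
    y = (X[:, 0] * X[:, 1] > 0).astype(float).reshape(-1, 1)
    return X, y

def sigmoid(z):
    # tanh form avoids overflow for large |z|
    return 0.5 * (1 + np.tanh(z / 2))

def train_and_score(lr, hidden, steps=300, seed=0):
    """Train a one-hidden-layer network with full-batch gradient descent
    and return the validation cross-entropy loss."""
    X, y = make_data()
    X_tr, y_tr, X_va, y_va = X[:150], y[:150], X[150:], y[150:]
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.5, size=(2, hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(scale=0.5, size=(hidden, 1))
    b2 = np.zeros(1)
    for _ in range(steps):
        h = np.tanh(X_tr @ W1 + b1)            # hidden activations
        p = sigmoid(h @ W2 + b2)               # predicted probabilities
        d_out = (p - y_tr) / len(X_tr)         # dLoss/dlogit
        d_h = (d_out @ W2.T) * (1 - h ** 2)    # backprop through tanh
        W2 -= lr * (h.T @ d_out)
        b2 -= lr * d_out.sum(axis=0)
        W1 -= lr * (X_tr.T @ d_h)
        b1 -= lr * d_h.sum(axis=0)
    h = np.tanh(X_va @ W1 + b1)
    p = np.clip(sigmoid(h @ W2 + b2), 1e-7, 1 - 1e-7)
    return float(-np.mean(y_va * np.log(p) + (1 - y_va) * np.log(1 - p)))

# Candidate values and baselines are arbitrary choices for this sketch.
learning_rates = [1e-3, 1e-2, 1e-1, 1.0]
hidden_sizes = [2, 4, 8, 16]
base_lr, base_hidden = 0.1, 8

loss_by_lr = {lr: train_and_score(lr, base_hidden) for lr in learning_rates}
loss_by_hidden = {h: train_and_score(base_lr, h) for h in hidden_sizes}

# Spread = how much the validation loss changes across the sweep.
spread_lr = max(loss_by_lr.values()) - min(loss_by_lr.values())
spread_hidden = max(loss_by_hidden.values()) - min(loss_by_hidden.values())

print("loss by learning rate:", loss_by_lr)
print("loss by hidden units: ", loss_by_hidden)
print(f"spread over learning rate: {spread_lr:.4f}")
print(f"spread over hidden units:  {spread_hidden:.4f}")
```

A larger spread suggests the model is more sensitive to that hyper-parameter on this particular problem and baseline. A single toy sweep like this proves nothing general; the ranking can change with the dataset, the ranges you pick, and the baseline values, which is why it takes many experiments to build intuition.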

Moreover, as practical results accumulate, one often tends to give less importance to certain hyper-parameters and more to others. For instance, after you have trained a lot of neural networks yourself, you will get some experience regarding the number of layers a neural network needs for a given task, and you can start your model with good initial values. In some cases you might need to increase or decrease the number of layers, and that’s perfectly acceptable, but in most cases you will find that the initial number of layers you started with is quite acceptable.

However, I am assuming that when we compare 2 hyper-parameters for their importance, we are starting from initial estimates that are as good as possible. For instance, if someone starts with a neural network of 1 layer, tunes only the learning rate without ever tuning the number of layers, and then expects the results to be good, well, that person is in for bad luck!
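As a rough illustration of that bad-luck case, the sketch below reads "a neural network of 1 layer" as a model with no hidden layer (i.e. logistic regression); that reading, the XOR-like toy data, and the helper name `one_layer_accuracy` are my assumptions. Because the labels are not linearly separable, changing only the learning rate should not help much.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 2))
# XOR-like labels: positive when both features share a sign.
# No straight line separates these classes well.
y = (X[:, 0] * X[:, 1] > 0).astype(float)

def one_layer_accuracy(lr, steps=500):
    """Train logistic regression (no hidden layer) with gradient descent
    and return its accuracy on the training data."""
    w, b = np.zeros(2), 0.0
    for _ in range(steps):
        p = 0.5 * (1 + np.tanh((X @ w + b) / 2))  # numerically stable sigmoid
        grad = p - y                              # dLoss/dlogit per example
        w -= lr * (X.T @ grad) / len(X)
        b -= lr * grad.mean()
    pred = (X @ w + b) > 0
    return float(np.mean(pred == (y > 0.5)))

# Tune only the learning rate; the architecture stays fixed.
accuracies = {lr: one_layer_accuracy(lr) for lr in (0.001, 0.01, 0.1, 1.0)}
for lr, acc in accuracies.items():
    print(f"lr={lr:<6} accuracy={acc:.2f}")
```

On data like this, accuracy should stay close to chance for every learning rate, since the limitation is the model's capacity rather than the optimizer's step size.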

To conclude, as your experience grows, you will learn good initial estimates for certain hyper-parameters, so you can focus on the rest and keep those hyper-parameters in the back seat.

It’s more like you are the neural network: your experiments are the inputs to the model, and the weights you learn are analogous to the initial estimates of certain hyper-parameters. As the iterations (experience) increase, you learn to produce better outputs (better neural networks with less tuning). I hope this helps.

Cheers,
Elemento

Lina_Hourieh · August 1, 2022, 9:27am

Very satisfying answer... Thank you.
