When we are building a neural network, how do we decide on the number of hidden layers and the number of units inside each hidden layer? For example, I was recently working on a loan dataset where I had to predict whether the customer would be able to repay the loan. There were about 15+ features (such as salary, age, previous loans, etc.).
I am curious what the deciding factor for the number of hidden layers is. I completed the ML Specialization recently, but did not find an answer to this in any of its courses.
Welcome to the community!
Here you should find a similar thread where the question is answered: Dimensioning a neural network - #2 by AbdElRhaman_Fakhry
In summary:
There is no right or wrong answer, but you can think about whether the dimensional space spanned by the neurons is sufficient to solve your problem. Ultimately it is your task to design a good architecture, also through trial and error. I have often seen ML engineers increase the feature dimension (i.e. the number of neurons) in the first hidden layer relative to the input layer, which can be interpreted as giving the net more capacity to learn complex and abstract behaviour.
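As a minimal sketch (assuming Keras and the 15-feature loan example above; the layer sizes are illustrative assumptions, not a recommendation), such an architecture with a first hidden layer wider than the input could look like this:

```python
# Minimal sketch: first hidden layer wider than the 15-feature input,
# then tapering down to a single sigmoid output for repay / default.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(15,)),                     # 15 tabular features
    tf.keras.layers.Dense(32, activation="relu"),    # wider than the input layer
    tf.keras.layers.Dense(16, activation="relu"),    # then narrow down again
    tf.keras.layers.Dense(1, activation="sigmoid"),  # binary classification output
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
```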
For example you can:
- check and quantify whether these features contain mutual information, and remove redundant information (with Principal Component Analysis (PCA) or Partial Least Squares (PLS)) to improve your ratio of data to features;
- also check the feature importance to focus on the most meaningful ones (a short sketch of both checks follows this list)!
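As a rough illustration (a minimal sketch assuming scikit-learn; the random data below is only a placeholder for your real loan features and labels), both checks could look like this:

```python
# (1) PCA: how many components are needed to retain e.g. 95% of the variance?
# (2) A simple tree-based feature-importance ranking.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 15))              # placeholder for the real feature matrix
y = rng.integers(0, 2, size=500)            # placeholder for repay / default labels

pca = PCA(n_components=0.95).fit(X)
print("components needed for 95% variance:", pca.n_components_)

forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
ranking = np.argsort(forest.feature_importances_)[::-1]
print("features ranked by importance:", ranking)
```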
More details on PCA, PLS, importance calculation etc. can be found here: https://github.com/christiansimonis/CRISP-DM-AI-tutorial/blob/master/Classic_ML.ipynb
Please let me know if you have any open questions.
Best regards
Christian
Hello @Devarsh_Mavani
In addition to what @Christian_Simonis has said, DeepLearning.AI has another interesting course that covers hyperparameter tuning and searching for the best model architecture. Check it out: Machine Learning Modeling Pipelines in Production
Thanks for your reply @Christian_Simonis.
I went through the links, and what I understood is that there is no mathematical way of arriving at the number of layers and neurons; it is based entirely on trial and error. If that is the case, then wouldn't it be computationally expensive and time-consuming to build something as powerful as ChatGPT?
@Devarsh_Mavani For such massive models, there is a technique called Neural Architecture Search (NAS),
which can be carried out with tools such as KerasTuner.
KerasTuner is an easy-to-use, scalable hyperparameter optimization framework that solves the pain points of hyperparameter search. Easily configure your search space with a define-by-run syntax, then leverage one of the available search algorithms to find the best hyperparameter values for your models. KerasTuner comes with Bayesian Optimization, Hyperband, and Random Search algorithms built-in, and is also designed to be easy for researchers to extend in order to experiment with new search algorithms.
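As a rough sketch of how this looks in practice (the layer ranges, unit ranges, trial count, and placeholder data below are illustrative assumptions, not the setup used for models at ChatGPT scale), you can let the tuner search over the number of hidden layers and units:

```python
# Minimal KerasTuner sketch: random search over the number of hidden layers,
# units per layer, and learning rate for a 15-feature binary classifier.
import numpy as np
import tensorflow as tf
import keras_tuner as kt

def build_model(hp):
    model = tf.keras.Sequential()
    model.add(tf.keras.Input(shape=(15,)))  # e.g. the 15 loan features
    # Let the tuner choose how many hidden layers and how wide each one is.
    for i in range(hp.Int("num_layers", 1, 3)):
        model.add(tf.keras.layers.Dense(
            units=hp.Int(f"units_{i}", min_value=16, max_value=128, step=16),
            activation="relu"))
    model.add(tf.keras.layers.Dense(1, activation="sigmoid"))
    model.compile(
        optimizer=tf.keras.optimizers.Adam(
            hp.Choice("learning_rate", [1e-2, 1e-3, 1e-4])),
        loss="binary_crossentropy",
        metrics=["accuracy"])
    return model

tuner = kt.RandomSearch(
    build_model,
    objective="val_accuracy",
    max_trials=10,
    directory="tuning",
    project_name="loan_default")

# Placeholder data standing in for your real training set.
x_train = np.random.rand(200, 15).astype("float32")
y_train = np.random.randint(0, 2, size=200).astype("float32")

tuner.search(x_train, y_train, epochs=5, validation_split=0.2)
best_hps = tuner.get_best_hyperparameters(num_trials=1)[0]
print(best_hps.values)  # e.g. chosen num_layers, units, learning_rate
```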
You can read more about it in these papers
@rmwkwok might be interested in this as NAS came up recently in a different thread
Thank you @ai_curious!