Choosing Neural Network architecture

In the first week's lecture we are introduced to neural networks and the digit recognition example. I have a basic question: how do we decide the number of layers and the number of units in each layer? For example, in the example we had 3 layers with a different number of units in each of them. How did we choose these numbers? How did we decide that we would have 3 layers, where the first has 25 units, the second has 15, and so on …


The number of layers and the size of each layer are determined by experimentation, guided by your experience.

You want a complex enough model to get good enough performance, but not so complicated that training is difficult or time-consuming.
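One rough way to compare how "complex" two candidate designs are is to count their trainable parameters. The sketch below assumes plain fully connected layers; the 400 inputs and 10 outputs are illustrative assumptions (only the 25 and 15 hidden units come from the question above), and `count_dense_parameters` is a hypothetical helper name.

```python
def count_dense_parameters(layer_sizes):
    """Count weights and biases in a fully connected network.

    layer_sizes lists the input size followed by the units in each layer.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out  # weight matrix plus one bias per unit
    return total


# Assumed sizes: 400 inputs and 10 outputs are illustrative, not from the thread.
two_hidden = count_dense_parameters([400, 25, 15, 10])
one_hidden = count_dense_parameters([400, 25, 10])
print(two_hidden, one_hidden)  # 10575 10285
```

Parameter count is only a proxy for training cost, but it gives a quick sanity check before you start experimenting.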

Two hidden layers are not really necessary in this assignment; the second hidden layer is only there for educational purposes.


I was thinking the same thing as Sourabh_Mishra. I just finished the course, and neural networks seem incredibly useful, but there was never any general guidance on how to build the hidden layers or how many units to use for different types of problems. Is 25 / 15, … a good place to start, or are these sizes only appropriate for certain types of models?

Every data set is going to require different optimization of the NN design.

Tips (some loose rules of thumb):

  • Start with one hidden layer.
  • Start with the number of units as the square root of the number of input features.
  • Adjust the number of units to get improved performance.
  • Only add a second hidden layer if you can’t get good enough performance with one hidden layer.
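The tips above can be sketched as a small search loop. This is a minimal illustration, not a definitive recipe: `evaluate` is a hypothetical callback that you would supply (it should train a network with the given hidden layer sizes and return a validation score, higher being better), and `target_score` and `scales` are assumed knobs introduced for the example.

```python
import math


def initial_hidden_units(n_features):
    """Rule-of-thumb starting point: square root of the input feature count."""
    return max(1, round(math.sqrt(n_features)))


def choose_architecture(n_features, evaluate, target_score, scales=(0.5, 1, 2, 4)):
    """Try one hidden layer first; add a second only if the target is missed.

    evaluate(hidden_layers) is a user-supplied (hypothetical) function that
    trains a model with the given tuple of layer sizes and returns a
    validation score where higher is better.
    """
    start = initial_hidden_units(n_features)
    # Candidate unit counts around the square-root starting point.
    unit_options = sorted({max(1, round(start * s)) for s in scales})

    # Step 1: a single hidden layer, adjusting the number of units.
    best_layers, best_score = None, float("-inf")
    for units in unit_options:
        score = evaluate((units,))
        if score > best_score:
            best_layers, best_score = (units,), score

    if best_score >= target_score:
        return best_layers, best_score

    # Step 2: one layer was not good enough, so try adding a second layer.
    first = best_layers[0]
    for units in unit_options:
        layers = (first, units)
        score = evaluate(layers)
        if score > best_score:
            best_layers, best_score = layers, score

    return best_layers, best_score
```

In practice `evaluate` would wrap whatever training code you already have, for example fitting a model on the training set and scoring it on a held-out validation set. The function returns the best configuration it found even if the target score was never reached.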