Week 1 : What is the rationale for selecting number of neurons in each layer?

In Week 1 of Advanced Learning Algorithms, it is mentioned that the digit image recognition network has 25 units in layer 1, 15 units in layer 2, and 1 unit in the output layer (understandably). Some other examples used 3 neurons in layer 1 and 2 neurons in layer 2.

However, I did not understand the rationale behind selecting those particular numbers of neurons. Is this explained in later weeks as part of designing neural networks? Can someone please shed some light on this? Thank you.

Hello @Rajshekhar_Lolage,

It is explained in Week 3 of the course. The whole week is relevant, but I can share one of the slides from the 3rd video and discuss it here to give you a quick idea of what’s going to happen:

While we would all like to know how to “set” those numbers, the fact is that we “choose” them (phrased as choosing a neural network architecture in the slide) from a list of possibilities that we, I would say, guess. However, these are not blind guesses, and the selection process is metric-oriented.

In Week 3, we will learn the concept of the Bias and Variance trade-off, where Bias and Variance are both errors that we want to balance to achieve the best overall performance. To balance them, we try adjusting the architecture (those numbers), and in each trial we evaluate the model's performance on a so-called test set (see the bottom line of the slide): the lower the generalization error (the bottom line again), the better the performance.

There is no formula for calculating the numbers to “set”; instead, we should be ready to embrace the idea that those numbers are informed improvements made over an iterative loop of model development.
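To make that iterative loop concrete, here is a minimal NumPy sketch, not from the course itself: the toy dataset, the candidate hidden-layer sizes, and the training hyperparameters are all made up for illustration. It trains one-hidden-layer networks of a few candidate sizes and keeps the one with the lowest error on held-out data, which is the essence of the selection process described above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data standing in for a real task:
# label is 1 when the point lies inside a circle.
X = rng.uniform(-1, 1, size=(600, 2))
y = (X[:, 0] ** 2 + X[:, 1] ** 2 < 0.5).astype(float)

# Split into a training set and a held-out evaluation set.
X_train, y_train = X[:400], y[:400]
X_eval, y_eval = X[400:], y[400:]

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(hidden_units, epochs=300, lr=1.0):
    """Train a one-hidden-layer network with plain batch gradient descent."""
    W1 = rng.normal(0, 0.5, (2, hidden_units))
    b1 = np.zeros(hidden_units)
    W2 = rng.normal(0, 0.5, (hidden_units, 1))
    b2 = np.zeros(1)
    for _ in range(epochs):
        # Forward pass.
        A1 = sigmoid(X_train @ W1 + b1)
        A2 = sigmoid(A1 @ W2 + b2).ravel()
        # Backward pass (gradient of binary cross-entropy loss).
        dZ2 = (A2 - y_train)[:, None] / len(y_train)
        dW2 = A1.T @ dZ2
        dZ1 = (dZ2 @ W2.T) * A1 * (1 - A1)
        dW1 = X_train.T @ dZ1
        W2 -= lr * dW2; b2 -= lr * dZ2.sum(axis=0)
        W1 -= lr * dW1; b1 -= lr * dZ1.sum(axis=0)
    return W1, b1, W2, b2

def error(params, X, y):
    """Fraction of misclassified examples."""
    W1, b1, W2, b2 = params
    pred = sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2).ravel() > 0.5
    return float(np.mean(pred != y))

# Try a few candidate architectures (guessed sizes) and keep the one
# with the lowest error on the held-out data.
results = {h: error(train(h), X_eval, y_eval) for h in (1, 5, 25)}
best = min(results, key=results.get)
print(results, "-> best hidden size:", best)
```

This is exactly the “guess, train, evaluate, pick the best” loop: the candidate sizes are guesses, but the final choice is driven by the measured generalization error, not by a formula.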

Cheers,
Raymond
