How to come up with the architecture of a neural network?

I understand how to use cross-validation data to tune the degree of a polynomial or the regularization strength, but how do I come up with the initial set of neural network architectures to tune? There seem to be countless combinations of how many layers to use and how many neurons to put in each layer. Is there a guideline for setting up the architecture of a neural network?

Hello @Jinyan_Liu,

It is similar to the problem of choosing the initial set of degrees to try in a polynomial model. There are countless combinations there too: for example, we could have w_1x + w_2x^2 + w_3x^3, or w_1x + w_3x^3, or many more.
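To make the "countless combinations" point concrete, here is a minimal sketch that lists every non-empty choice of polynomial terms up to a maximum degree. The function name `polynomial_term_subsets` is made up for this illustration; the tuple `(1, 3)` stands for the model w_1x + w_3x^3.

```python
from itertools import combinations


def polynomial_term_subsets(max_degree):
    """Yield every non-empty subset of the degrees 1..max_degree as a tuple."""
    degrees = range(1, max_degree + 1)
    for size in range(1, max_degree + 1):
        yield from combinations(degrees, size)


subsets = list(polynomial_term_subsets(3))
print(subsets)
# With d possible degrees there are 2**d - 1 non-empty subsets,
# so the number of candidate models roughly doubles with each extra degree.
print(len(subsets))
```

Even with only three possible degrees there are already seven candidate models, which is why in practice we usually start from a small, hand-picked set.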

With neural networks, we face a similar challenge.
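As a rough illustration of how quickly the space grows, the sketch below enumerates fully connected architectures from a small menu of layer widths and counts their trainable parameters. The helper names, the width choices `(4, 8, 16)`, and the limit of three hidden layers are all assumptions picked for the example, not recommendations.

```python
from itertools import product


def candidate_architectures(unit_choices, max_layers):
    """Yield tuples of hidden-layer widths, e.g. (8, 4) means two hidden layers."""
    for depth in range(1, max_layers + 1):
        yield from product(unit_choices, repeat=depth)


def count_parameters(n_inputs, hidden_layers, n_outputs):
    """Count weights and biases of a dense network with the given layer widths."""
    sizes = [n_inputs, *hidden_layers, n_outputs]
    # Each dense layer has fan_in * fan_out weights plus fan_out biases.
    return sum((fan_in + 1) * fan_out for fan_in, fan_out in zip(sizes, sizes[1:]))


candidates = list(candidate_architectures(unit_choices=(4, 8, 16), max_layers=3))
print(len(candidates))  # 3 + 9 + 27 candidates from just three width choices

for arch in candidates[:5]:
    print(arch, count_parameters(n_inputs=2, hidden_layers=arch, n_outputs=1))
```

Three width choices and at most three hidden layers already give 39 candidates, so an exhaustive search is rarely practical and a short, reasoned shortlist is the usual starting point.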

What do you think?


Oh right! Thanks! So I guess all of that will depend more on experience?

The final decision is, as you said, based on cross-validation. As for the initial choices, experience always plays a part, and you can also design the architecture deliberately. If, in the future, you are interested in taking the Deep Learning Specialization, then in its Course 4 you will come across a few popular architectures and hear how they were justified. Those justifications are the design reasoning.
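Here is a minimal sketch of that selection step: given a shortlist of architectures, score each one with k-fold cross-validation and keep the one with the lowest mean validation error. Everything named here is hypothetical. In particular, `fit_and_score` is a callable you would supply yourself (for example, one that trains a model on the training indices and returns its error on the validation indices), and `toy_fit_and_score` is only a stand-in so the example runs without any ML library.

```python
from statistics import mean


def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous (train, validation) index pairs."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        stop = start + size
        val = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        folds.append((train, val))
        start = stop
    return folds


def select_architecture(candidates, n_samples, k, fit_and_score):
    """Return (best, scores); scores maps each candidate to its mean validation error."""
    scores = {}
    for arch in candidates:
        errors = [fit_and_score(arch, train, val)
                  for train, val in k_fold_indices(n_samples, k)]
        scores[arch] = mean(errors)
    best = min(scores, key=scores.get)
    return best, scores


def toy_fit_and_score(arch, train_idx, val_idx):
    # Hypothetical stand-in for real training: it ignores the data and simply
    # pretends the error is smallest when the network has 16 hidden units in total.
    return abs(sum(arch) - 16) / 16


shortlist = [(4,), (8, 8), (32, 16, 8)]  # made-up candidates for illustration
best, scores = select_architecture(shortlist, n_samples=10, k=5,
                                   fit_and_score=toy_fit_and_score)
print(best, scores)
```

With a real `fit_and_score`, the shortlist itself is where experience and design reasoning come in; cross-validation only ranks what you chose to include.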

If, in the future, you become passionate about designing architectures, you will also start to investigate what each neuron does, how the neurons differ from one another, and so on, by examining the neurons’ weights or their outputs for particular inputs. However, I am not going into the details here.

As for the 3 architectures you shared, I recommend trying them out in this TensorFlow Playground. Although you can’t set that many units in a layer, I believe you can downsize those architectures while maintaining their relative differences. If you agree that experience is important, perhaps the Playground is a good place to build some up without bothering to code :wink: Try to reason about what you observe :wink:



The following paper shows how we might examine the weights. It also includes references to other relevant published work.

I just wanted to say that we can look at the weights and try to discover something about them; I don’t mean to suggest that you read the paper. However, one day, your passion might lead you to something like that again. :wink: For now, I think the TensorFlow Playground will do, and we had better focus on it.



Thank you for all the suggestions and for sharing! Now I am thinking of taking the Deep Learning Specialization after this one. :laughing: