I have a general question about neural networks. How does one determine how many layers and units in each layer for a particular problem or set of data? Are there rules of thumb documented somewhere? Are the layers and units derived from the complexity of the inputs and how many outputs? Hopefully, it’s more deterministic than just trial and error.
Experimentation guided by experience.
You gain experience by doing a lot of experimentation.
There are lots of rules of thumb. There is little formal documentation, because writing them down just causes arguments among ML experts over whether a given rule of thumb applies to all situations.
The number of output units is obvious - for classification it's the number of classes (labels) in the dataset.
You can’t easily assess the complexity of the inputs to make any decisions in advance about the number of hidden layer units.
Start with one hidden layer. Adjust the number of units. Only add more hidden layers if you can’t get good enough performance with one - and only add more hidden layers if you have enough data to be able to train all those combinations of features.
For simplicity, initially keep the number of units in each hidden layer the same. That limits the number of permutations you need to evaluate.
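That kind of sweep is easy to script. Below is a minimal sketch of the idea using a hand-rolled one-hidden-layer network in NumPy; the toy dataset, candidate widths, learning rate, and epoch count are all illustrative assumptions, not recommendations for any real problem.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy binary-classification data: two Gaussian blobs in 2D.
X = np.vstack([rng.normal(-1.0, 0.7, size=(100, 2)),
               rng.normal(+1.0, 0.7, size=(100, 2))])
y = np.concatenate([np.zeros(100), np.ones(100)])

def train_one_hidden_layer(n_hidden, epochs=1000, lr=0.5):
    """Full-batch gradient descent on a 2 -> n_hidden -> 1 network."""
    W1 = rng.normal(0, 0.5, size=(2, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0, 0.5, size=(n_hidden, 1))
    b2 = 0.0
    for _ in range(epochs):
        h = np.tanh(X @ W1 + b1)                       # hidden activations
        p = 1 / (1 + np.exp(-(h @ W2).ravel() - b2))   # sigmoid output
        # Gradients of the mean binary cross-entropy loss.
        d_out = (p - y) / len(y)
        gW2 = h.T @ d_out[:, None]
        gb2 = d_out.sum()
        d_h = (d_out[:, None] @ W2.T) * (1 - h**2)     # backprop through tanh
        gW1 = X.T @ d_h
        gb1 = d_h.sum(axis=0)
        W1 -= lr * gW1; b1 -= lr * gb1
        W2 -= lr * gW2; b2 -= lr * gb2
    preds = (p > 0.5).astype(float)
    return (preds == y).mean()

# Sweep a few widths of the single hidden layer and compare.
results = {n: train_one_hidden_layer(n) for n in (2, 4, 8)}
for n, acc in results.items():
    print(f"hidden units: {n:2d}  train accuracy: {acc:.2f}")
```

Once the smallest width that is "good enough" is found, only then is it worth considering a second hidden layer. In practice you would run the same loop with a library model (e.g. Keras or scikit-learn) and a proper train/validation split.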
When you get "good enough" performance, stop and congratulate yourself. Only chase marginal gains if it gives some meaningful advantage (it usually won't). Changing the accuracy by a percent or two is not really useful. You'll get that amount of variation just by re-shuffling and splitting the dataset again.
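A quick back-of-envelope check shows why a percent or two is just noise. If you model the test set as a binomial sample, the standard deviation of measured accuracy follows directly; the true accuracy and test-set size below are illustrative assumptions.

```python
import math

true_acc = 0.90   # assumed "real" accuracy of the model
n_test = 500      # assumed test-set size

# Std of measured accuracy under a binomial model of the test split.
std = math.sqrt(true_acc * (1 - true_acc) / n_test)
print(f"std of measured accuracy: {std:.3f}")  # about 0.013, i.e. ~1.3%
```

So with a 500-example test set, two different random splits can easily disagree by a couple of percentage points without the model having changed at all.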
It’s about experience one gains after many attempts at trial and error. Do you agree?