Can someone help me understand how the number of nodes in a layer is determined?
For example, in the Week3 videos, there is an example of a NN that has 3 features in the input layer, 4 nodes in the hidden layer and 1 node in the output layer. Why 4 nodes in the hidden layer? What decides this? Does the output layer always have 1 node?
The number of nodes in each layer, and even the number of hidden layers, is decided by the architect of the NN: by you, by me, by the person or group that is creating the NN.
There are several options:
1. You can start with a known model created by researchers. Such a model already proposes a certain number of layers and a number of nodes inside each layer.
2. You can start from scratch.
2.1 If you are a very experienced ML designer, maybe you'll start with a configuration that experience tells you is the best layout for the task at hand.
2.2 If you are a novice, you may start with your best guess. For instance, 2 hidden layers, each with 4 nodes.
In any case, you'll design your NN, train it, and watch the results. If the results do not meet your objectives, you start 'fine-tuning' your architecture (although fine-tuning the data may be more important, but that's another topic). In the end you will hopefully reach a design (number of layers, number of units inside each layer, etc.) that meets your objective.
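To make this concrete, here is a minimal NumPy sketch in which the layer sizes are just a list chosen by the designer. The helper names (`init_params`, `forward`), the tanh hidden activation, and the candidate layouts are my own assumptions for illustration, not something taken from the course code. It only runs a forward pass to show how the shapes follow from your choice; there is no training here.

```python
import numpy as np

def init_params(layer_sizes, seed=0):
    """Build one (W, b) pair per layer; layer_sizes is the designer's choice."""
    rng = np.random.default_rng(seed)
    params = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        W = rng.standard_normal((n_out, n_in)) * 0.01  # small random weights
        b = np.zeros((n_out, 1))
        params.append((W, b))
    return params

def forward(X, params):
    """Forward pass: tanh in hidden layers, sigmoid in the output layer."""
    A = X
    last = len(params) - 1
    for i, (W, b) in enumerate(params):
        Z = W @ A + b
        A = 1.0 / (1.0 + np.exp(-Z)) if i == last else np.tanh(Z)
    return A

# 3 input features, 5 examples (random data, only to check shapes)
X = np.random.default_rng(1).standard_normal((3, 5))

# Three hypothetical layouts a designer might try, one after another
for layer_sizes in ([3, 4, 1], [3, 4, 4, 1], [3, 16, 1]):
    params = init_params(layer_sizes)
    out = forward(X, params)
    print(layer_sizes, [W.shape for W, _ in params], out.shape)
```

Changing the architecture means changing only the list, for example `[3, 4, 1]` for the Week 3 example or `[3, 4, 4, 1]` for the "2 hidden layers, each with 4 nodes" guess. In practice you would train each candidate and compare the results on held-out data.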
The other question is:
“Does the output layer always have 1 node?”
No, not always. You'll soon find out that the 1-node output is for the very specific case of binary classification. That is, when you want to determine whether something belongs to a given class or not. For instance: Cat / No-Cat.
You’ll learn about many other types of NN that provide many other types of outputs.
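As a rough illustration of how the output layer follows from the task, the sketch below feeds the same 4 hidden activations into two different output layers: 1 node with a sigmoid for a Cat / No-Cat decision, and 3 nodes with a softmax for a 3-class problem. The 3-class case, the variable names, and the random weights are assumptions made up for this example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    # Subtract the column-wise max for numerical stability
    z = z - z.max(axis=0, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=0, keepdims=True)

rng = np.random.default_rng(0)
A_hidden = rng.standard_normal((4, 5))  # 4 hidden units, 5 examples

# Binary classification: 1 output node, one probability per example
W_bin, b_bin = rng.standard_normal((1, 4)), np.zeros((1, 1))
p_cat = sigmoid(W_bin @ A_hidden + b_bin)

# Multi-class classification (3 hypothetical classes): 3 output nodes,
# one probability per class per example, each column summing to 1
W_multi, b_multi = rng.standard_normal((3, 4)), np.zeros((3, 1))
p_classes = softmax(W_multi @ A_hidden + b_multi)

print(p_cat.shape, p_classes.shape)
```

The hidden layer is identical in both cases; only the number of output nodes and the output activation change with the kind of answer you want from the network.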
I hope this sheds light on your questions. If you still have any doubts, please don't hesitate to share them.