Why do we use multiple neurons in the hidden layer of a simple ANN?

In such a simple neural network, every neuron in the hidden layer receives exactly the same input, i.e. the same n features, from the previous layer. Would it be right to say that the only reason for using multiple neurons in the hidden layer is to give each neuron its own random initial weights and/or biases, just so we avoid getting trapped in some local minimum? Or are there other reasons for using multiple neurons?

I do understand that there could be several reasons to use multiple neurons, but my question is specifically about the neural network shown in the image above.
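
To make the setup concrete, here is a minimal NumPy sketch (a toy 2-2-1 network of my own, purely for illustration). If both hidden neurons start with identical weights, they compute identical outputs and receive identical gradients, so without random initialization they would remain copies of each other no matter how long you train:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy 2-2-1 network: 2 inputs, 2 hidden sigmoid neurons, 1 linear output.
# Both hidden neurons receive the SAME input vector, as in the image.
x = np.array([0.5, -1.0])    # one training example
t = 1.0                      # its target value

W1 = np.full((2, 2), 0.3)    # identical initial weights for both hidden neurons
b1 = np.zeros(2)
w2 = np.full(2, 0.3)         # identical output weights
b2 = 0.0

# Forward pass
h = sigmoid(W1 @ x + b1)     # both entries of h are identical
y = w2 @ h + b2

# Backward pass for squared error 0.5 * (y - t)**2
dy = y - t
dw2 = dy * h                 # same gradient for both output weights
dh = dy * w2
dW1 = np.outer(dh * h * (1.0 - h), x)

print(h)     # [0.4626 0.4626] -- both neurons compute the same thing
print(dW1)   # both rows equal -- after any gradient step they stay equal
```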

There are a few reasons to use many neurons in the hidden layers, but the main one is to model the problem in a high-dimensional space: you don't know how many variables are involved in the problem, so you don't have a definitive number of parameters up front.

So you model it in a high-dimensional space so that the non-linear model is better fitted/approximated. Be aware, though, that if you use too many neurons you can overfit.
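
As a quick illustration of that capacity trade-off (my own toy sketch using scikit-learn and a made-up sin target, not anything from the course): with too few hidden neurons the network cannot bend enough to fit a non-linear target, with more it fits well, and with far more capacity than the data justifies it can start to memorize the noise:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# We pretend we don't know the "governing equation" (here sin(2x) plus noise);
# we only see noisy samples of it.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(2 * X[:, 0]) + 0.1 * rng.standard_normal(200)

for width in (1, 10, 500):
    net = MLPRegressor(hidden_layer_sizes=(width,), activation="tanh",
                       max_iter=5000, random_state=0)
    net.fit(X, y)
    print(f"{width:4d} hidden neurons: train R^2 = {net.score(X, y):.3f}")

# width=1   -> too few basis functions, underfits
# width=10  -> enough capacity to approximate sin(2x)
# width=500 -> far more capacity than needed; on so little data it can overfit
```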

Can you explain the mathematical details of using multiple neurons in such a neural network? Also, what do you mean by "you don't know how many variables are involved in the problem"?

No, I can't explain that in full; that would be a course in itself. The quotation means you don't have an equation that governs the model with fixed and known variables, so you use the non-linear neurons to find a fit, or an approximation, to the "equation" that governs the model under consideration.
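
Very roughly, though, the one-hidden-layer picture can be sketched like this (a simplified view, assuming an activation function $g$ such as tanh or sigmoid):

$$
h_j = g\Big(\sum_i w_{ji}\, x_i + b_j\Big), \qquad \hat{y} = \sum_j v_j\, h_j + c
$$

Each hidden neuron $h_j$ is one non-linear "basis function" of the same inputs, and the output layer takes a weighted combination of them. With a single neuron you only get one shifted, scaled activation curve; with many neurons you get many basis functions to combine, which is what lets the network approximate an unknown governing function (the universal approximation idea).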

You should check out the Deep Learning Specialization that we have here 🙂

I have been researching and trying to understand the answer to this question, and so far this is what I have: I am learning about the XOR problem with the vanilla perceptron, and the reason we use multiple perceptrons is to overcome the XOR problem, i.e. so that the network can perform non-linear tasks.
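
Here is the minimal sketch I put together while studying this (hand-picked weights with step activations, for illustration only, not learned ones). A single perceptron cannot draw the decision boundary XOR needs, but one hidden layer of two perceptrons can:

```python
def step(z):
    # Heaviside step activation of the classic perceptron
    return 1 if z > 0 else 0

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    h1 = step(x1 + x2 - 0.5)   # hidden perceptron 1: computes OR(x1, x2)
    h2 = step(x1 + x2 - 1.5)   # hidden perceptron 2: computes AND(x1, x2)
    y = step(h1 - h2 - 0.5)    # output: fires for "OR but not AND", i.e. XOR
    print((x1, x2), "->", y)

# (0, 0) -> 0, (0, 1) -> 1, (1, 0) -> 1, (1, 1) -> 0
```

Neither hidden perceptron alone separates the XOR classes; it is only their combination in the output perceptron that produces the non-linear boundary.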

Is this something that is covered in the Deep Learning Specialization? I plan to start that specialization, but only after I get this intuition right. If it's something that gets covered there, then I'll jump right into it.

Yes, it explains that and much more.
