Week 3 activation function

Why do we need multiple nodes in a layer? In other words, why do we need multiple \hat{y} values?

For a binary classification problem, the output layer has only one neuron, which gives the binary “Yes/No” answer. The multiple \hat{y} values are the answers for the different inputs (“samples”), which is to say the values of that single output neuron for each input in the given set.
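To make the shapes concrete, here is a minimal sketch (all names are illustrative, not from the course notebooks) of a single sigmoid output neuron applied to a batch of samples. One neuron still produces one \hat{y} per sample:

```python
import numpy as np

rng = np.random.default_rng(0)
n_x, m = 3, 4                      # 3 input features, 4 samples
X = rng.standard_normal((n_x, m))  # each column of X is one sample

# A single output neuron: weights (1, n_x), bias (1, 1)
W = rng.standard_normal((1, n_x))
b = np.zeros((1, 1))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

y_hat = sigmoid(W @ X + b)  # shape (1, m): one prediction per sample
print(y_hat.shape)          # (1, 4)
```

So the “multiple \hat{y} values” come from the multiple columns of X, not from multiple output neurons.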

Then in the hidden layers, we have more than one neuron. The point is that each neuron in a hidden layer is trained to recognize something different. The more complex the inputs are, the more neurons we may need in order to “see” all the different things that need to be recognized. The reason the neurons end up different is that we initialize them differently (“symmetry breaking”) and then let them learn through the gradients computed during backpropagation. Here’s a thread which talks about Symmetry Breaking in more detail.
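A quick illustration of why symmetry breaking matters (a sketch with made-up sizes): with zero initialization every hidden neuron computes the identical function, so they all produce the same activations and would receive the same gradients, whereas small random initialization starts each neuron out different:

```python
import numpy as np

rng = np.random.default_rng(1)
n_x, n_h, m = 3, 4, 5              # 3 features, 4 hidden neurons, 5 samples
X = rng.standard_normal((n_x, m))

# Zero initialization: every row of A_zero (every hidden neuron) is identical,
# so gradients are identical too and the neurons can never differentiate.
W_zero = np.zeros((n_h, n_x))
A_zero = np.tanh(W_zero @ X)
print(np.allclose(A_zero, A_zero[0]))  # True: all neurons compute the same thing

# Small random initialization breaks the symmetry: each neuron starts out
# different, so backpropagation can push each one toward a different feature.
W_rand = rng.standard_normal((n_h, n_x)) * 0.01
A_rand = np.tanh(W_rand @ X)
print(np.allclose(A_rand, A_rand[0]))  # False: the neurons already differ
```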

The next interesting and important question is “how do we decide how many neurons we need?” Prof Ng talks about that in the lectures here and spends quite a bit more time on it in Course 2 of this series. The basic idea is that you have to try and see what works. That is to say, you figure it out by running experiments.
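Here is one way such an experiment might look, as a hedged sketch in plain NumPy (the dataset, sizes, and the `train` helper are all made up for illustration): train the same tiny one-hidden-layer network on XOR with different hidden-layer sizes and compare the resulting accuracy. A 1-neuron hidden layer gives a linear boundary and cannot fit XOR, while a larger layer usually can:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy XOR dataset: not linearly separable, shape conventions as in the course
X = np.array([[0, 0, 1, 1],
              [0, 1, 0, 1]], dtype=float)  # (2, 4)
Y = np.array([[0, 1, 1, 0]], dtype=float)  # (1, 4)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(n_h, iters=5000, lr=1.0):
    """Train a 2-layer net (tanh hidden layer of size n_h, sigmoid output)
    by plain gradient descent; return its training accuracy."""
    W1 = rng.standard_normal((n_h, 2)) * 0.5
    b1 = np.zeros((n_h, 1))
    W2 = rng.standard_normal((1, n_h)) * 0.5
    b2 = np.zeros((1, 1))
    m = X.shape[1]
    for _ in range(iters):
        A1 = np.tanh(W1 @ X + b1)
        A2 = sigmoid(W2 @ A1 + b2)
        dZ2 = A2 - Y                        # cross-entropy + sigmoid gradient
        dW2 = dZ2 @ A1.T / m
        db2 = dZ2.mean(axis=1, keepdims=True)
        dZ1 = (W2.T @ dZ2) * (1 - A1 ** 2)  # tanh derivative
        dW1 = dZ1 @ X.T / m
        db1 = dZ1.mean(axis=1, keepdims=True)
        W1 -= lr * dW1; b1 -= lr * db1
        W2 -= lr * dW2; b2 -= lr * db2
    preds = sigmoid(W2 @ np.tanh(W1 @ X + b1) + b2) > 0.5
    return (preds == Y).mean()

for n_h in [1, 2, 4]:
    print(n_h, train(n_h))
```

In a real project you would compare validation (not training) accuracy, but the workflow is the same: sweep the size, measure, and pick the smallest layer that works.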
