Ask a question about neural networks

Week 1

I would like to ask: how does each neuron in a given layer output a different value from the other neurons in the same layer? All of them belong to the same model and are trained on the same training set, so they go through the same gradient descent and should end up with the same parameter values, which would produce the same output. How do they output different values?

That is because we initialize each of the neurons randomly at the beginning of training, so that each one starts training with different values. That is called “Symmetry Breaking”, and it is critical precisely because it allows each neuron to learn different things. Here’s a thread about Symmetry Breaking.
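For a concrete picture, here is a minimal NumPy sketch (the layer sizes and the 0.01 scaling factor are just illustrative, not the course's exact code) contrasting zero initialization with random initialization for one layer's weight matrix:

```python
import numpy as np

n_prev, n_curr = 3, 4   # illustrative layer sizes

# Zero (symmetric) initialization: every row (every neuron) is identical,
# so every neuron computes the same output and receives the same gradient.
W_zero = np.zeros((n_curr, n_prev))

# Random initialization breaks the symmetry: each neuron's weights start
# different, so each neuron can go on to learn a different feature.
W_rand = np.random.randn(n_curr, n_prev) * 0.01
b = np.zeros((n_curr, 1))   # the bias can start at zero; the weights already differ
```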


Yes, but still: after running the gradient descent algorithm, won't they end up with the same w and b? That would make the outputs the same, correct?

No, the individual elements of W^{[l]} (the W^{[l]}_{ij}) will be different. Because they start out different, the gradients each neuron receives are different as well, so gradient descent does not drive them toward the same values.
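To see why the gradients stay different, here is a tiny hand-rolled example (a made-up two-neuron hidden layer with a squared loss, purely for illustration) comparing identical versus different starting weights:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def layer_grads(W, x, y):
    """Gradient of the loss w.r.t. each hidden neuron's weights.

    Toy network: hidden layer a = sigmoid(W x), output y_hat = sum(a),
    loss = 0.5 * (y_hat - y)^2.
    """
    a = sigmoid(W @ x)                    # hidden activations, one per neuron
    y_hat = a.sum()
    da = (y_hat - y) * np.ones_like(a)    # dL/da
    dz = da * a * (1 - a)                 # dL/dz through the sigmoid
    return np.outer(dz, x)                # dL/dW, one row per neuron

x = np.array([1.0, 2.0])
y = 1.0

# Identical rows -> identical activations -> identical gradients:
# the two neurons stay exact clones of each other after every update.
W_same = np.array([[0.5, 0.5],
                   [0.5, 0.5]])
print(layer_grads(W_same, x, y))   # both rows are equal

# Different rows -> different activations -> different gradients,
# so gradient descent moves the two neurons in different directions.
W_diff = np.array([[0.5, -0.3],
                   [0.1,  0.7]])
print(layer_grads(W_diff, x, y))   # rows differ
```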
