I have a confusion. Let's say I am training a 2-layer NN (one hidden and one output layer) with 3 neurons in the hidden layer. How come different neurons in the hidden layer learn different weights? Since all the weights, no matter what their initial values are, should converge to the global minimum, shouldn't the weights of all the neurons end up the same? What am I missing? Can someone please clarify?
You are missing the fact that this is a high-dimensional space, and you need a parameter for each dimension.
Hello @Shubham219,
Welcome to the Discourse community, and thanks a lot for your question. I will do my best to answer it below.
You are missing the fact that different neurons in a hidden layer can learn different features of the input data. For example, if you are training a neural network to classify images of cats and dogs, one neuron in the hidden layer might learn to recognize the shape of a cat’s ears, while another neuron might learn to recognize the shape of a dog’s snout. This is because each neuron in the hidden layer is connected to all of the neurons in the input layer, and each connection has its own weight. The weights are adjusted during training so that the neural network can learn to classify the input data as accurately as possible.
Each neuron learns different weights during training. Even though every hidden neuron receives the same inputs from the previous layer, the weights are initialized randomly, so each neuron starts from a different point and therefore receives different gradient updates. As a result, each neuron learns to recognize different features in the input data and adjusts its weights accordingly to optimize the network's performance.
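To make this concrete, here is a minimal NumPy sketch of such a hidden layer (the layer sizes, random seed, and input are illustrative assumptions, not values from the course): the same input produces three different pre-activations precisely because each neuron has its own randomly initialized row of weights.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed only for reproducibility

n_in, n_hidden = 4, 3            # illustrative sizes: 4 inputs, 3 hidden neurons

# Each row of W1 holds the weights of one hidden neuron; random
# initialization gives every neuron a different starting point.
W1 = rng.standard_normal((n_hidden, n_in)) * 0.01
b1 = np.zeros((n_hidden, 1))

x = rng.standard_normal((n_in, 1))   # one example input, shared by all neurons
z1 = W1 @ x + b1                     # same input, three different pre-activations
print(z1.ravel())                    # three distinct values, one per neuron
```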
If all of the neurons in the hidden layer had the same weights, they would all compute the same output and receive exactly the same gradient updates, so they would all keep learning the same feature of the input data. The hidden layer would then be no more expressive than a single neuron, which would make it difficult for the neural network to classify the input data accurately.
To prevent all the weights from converging to the same value during training, the weights are initialized randomly. This is called symmetry breaking: even though every parameter is updated with the same learning rate, different starting points produce different gradients, so each neuron moves toward its own weights. Note also that the loss surface does not have a single minimum where all weights are equal; for example, permuting the hidden neurons (together with their outgoing weights) leaves the loss unchanged, so many different weight configurations are equally good.
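The symmetry problem is easy to demonstrate. The sketch below (the toy XOR-style data, layer sizes, learning rate, and number of steps are all illustrative assumptions, not from the original post) trains the same tiny 2-layer network twice: once with every weight set to the same constant, and once with random initialization. With the constant start, the three rows of W1 stay identical no matter how long you train, because every hidden neuron computes the same activation and receives the same gradient; with the random start, the rows diverge.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(W1, b1, W2, b2, X, Y, lr=0.5, steps=200):
    """A few plain gradient-descent steps on a 2-layer sigmoid network."""
    m = X.shape[1]
    for _ in range(steps):
        # forward pass
        A1 = sigmoid(W1 @ X + b1)            # hidden activations, shape (3, m)
        A2 = sigmoid(W2 @ A1 + b2)           # output, shape (1, m)
        # backward pass (binary cross-entropy with a sigmoid output)
        dZ2 = A2 - Y
        dW2 = dZ2 @ A1.T / m
        db2 = dZ2.mean(axis=1, keepdims=True)
        dZ1 = (W2.T @ dZ2) * A1 * (1 - A1)
        dW1 = dZ1 @ X.T / m
        db1 = dZ1.mean(axis=1, keepdims=True)
        # every parameter is updated with the same learning rate
        W1 -= lr * dW1; b1 -= lr * db1
        W2 -= lr * dW2; b2 -= lr * db2
    return W1

# toy XOR-style data (illustrative only)
X = np.array([[0, 0, 1, 1],
              [0, 1, 0, 1]], dtype=float)
Y = np.array([[0, 1, 1, 0]], dtype=float)

# Case 1: all hidden neurons start with identical weights.
W1 = np.full((3, 2), 0.5); b1 = np.zeros((3, 1))
W2 = np.full((1, 3), 0.5); b2 = np.zeros((1, 1))
print(train(W1, b1, W2, b2, X, Y))   # all three rows remain identical

# Case 2: random initialization breaks the symmetry.
rng = np.random.default_rng(0)
W1 = rng.standard_normal((3, 2)) * 0.5; b1 = np.zeros((3, 1))
W2 = rng.standard_normal((1, 3)) * 0.5; b2 = np.zeros((1, 1))
print(train(W1, b1, W2, b2, X, Y))   # the rows diverge into different features
```

In case 1 the network still trains in a limited sense, but the hidden layer effectively behaves like a single neuron, since the three rows can never differentiate.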
Here is an analogy that might help you understand why different neurons in a hidden layer can learn different features of the input data. Imagine that you are trying to learn to recognize different types of cars. You could do this by learning to recognize the shape of the car's body, the shape of its headlights, the shape of its taillights, and so on. Each of these features is a different way to recognize a car. In the same way, different neurons in a hidden layer can learn different features of the input data.
In summary, each neuron in a neural network learns different weights because the weights are initialized randomly, which breaks the symmetry between neurons that would otherwise receive identical gradient updates. This allows the network to learn complex patterns in the data.
I hope I was able to help you with your question. Please feel free to reply with a follow-up question if anything is still unclear.
Regards,
Can Koz