Open question: Regarding the working of a layer in the neural network

I’ve just finished watching the videos of MLS Course 2 Week 1. In the example video, Prof. Ng mentioned that the hidden features calculated by each node in the first hidden layer are distinct, even though all the nodes in that layer use the same activation function and are fed the same data. I’m sure there are other parameters involved that I haven’t noticed yet. Can someone explain how the work of extracting different features is distributed across the nodes of a layer?

Thanks,
Eakanath

Hello @eix_rap

Welcome to our community.

Even though the data fed to each node/neuron is the same and the activation function is the same, each neuron starts with different random values for its weights. Because of this, each neuron follows a different trajectory through the learning process, and every neuron ends up with different final weight values once training is complete. In this way, each node or neuron learns a different hidden feature.
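To make this concrete, here is a small NumPy sketch (my own illustration, not course code) of a one-hidden-layer network with two neurons. If both neurons start with identical weights, they receive identical gradients and remain clones forever; random starting values break that symmetry so the neurons can diverge and learn different features.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.array([0.5, -1.2, 3.0])   # one input example (3 features)
y = 1.0                          # target value

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(W1, w2, steps=100, lr=0.1):
    """Gradient descent on a 2-neuron hidden layer, squared-error loss."""
    for _ in range(steps):
        a1 = sigmoid(W1 @ x)           # hidden activations, shape (2,)
        yhat = w2 @ a1                 # linear output neuron
        err = yhat - y
        # backpropagation
        grad_w2 = err * a1
        grad_z1 = (err * w2) * a1 * (1 - a1)
        W1 -= lr * np.outer(grad_z1, x)
        w2 -= lr * grad_w2
    return W1

# Identical initialization: both rows of W1 get identical gradients,
# so the two neurons stay exact copies of each other.
W_same = train(np.ones((2, 3)) * 0.1, np.ones(2) * 0.1)
print(np.allclose(W_same[0], W_same[1]))   # True — neurons are clones

# Random initialization: the neurons follow different trajectories
# and end up with different weights (different learned features).
W_rand = train(rng.normal(scale=0.1, size=(2, 3)),
               rng.normal(scale=0.1, size=2))
print(np.allclose(W_rand[0], W_rand[1]))   # False
```

This is why the step is often called "symmetry breaking": without it, every neuron in a layer computes the same function no matter how long you train.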

Thanks for the explanation @shanup. A follow-up question: what techniques are used in practice to initialize the weights?

We just need to start them out with random values, and the learning algorithm takes care of the rest.
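For reference, a few random-initialization schemes commonly seen in practice are sketched below in NumPy (the layer sizes are made-up examples, and framework defaults vary):

```python
import numpy as np

rng = np.random.default_rng(42)
fan_in, fan_out = 400, 25   # hypothetical layer sizes (inputs, units)

# 1) Small random values — the simplest way to break symmetry.
W_small = rng.normal(scale=0.01, size=(fan_out, fan_in))

# 2) Xavier/Glorot uniform — scales the range by layer size to keep
#    activation variance roughly stable (often used with tanh/sigmoid).
limit = np.sqrt(6.0 / (fan_in + fan_out))
W_glorot = rng.uniform(-limit, limit, size=(fan_out, fan_in))

# 3) He initialization — variance scaled by fan-in, commonly
#    recommended for ReLU layers.
W_he = rng.normal(scale=np.sqrt(2.0 / fan_in), size=(fan_out, fan_in))

print(W_small.shape, W_glorot.shape, W_he.shape)
```

All three are just different recipes for the "start with random values" step; the scaling choices mainly affect how easily gradients flow early in training.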