How do we get different weights and biases for a dense layer?

I am a bit confused about TensorFlow's per-layer W and b.
When training a TensorFlow model, how are the weights and biases learned if there are multiple units? Considering that each unit in the first layer receives the same input vector X, why doesn't each unit eventually learn the same weights if the initial W is also the same? How does TensorFlow determine the features for each unit?
Taking the coffee roasting example, what would be the features for the input layer's units?


Hello @gauravv,

In short, the neurons in a layer learn different things because they are different at the beginning. When building a model with TensorFlow, the weights of each neuron must be initialized to some values, and by default they are initialized randomly, so it is essentially impossible for any two weights to share the same value. This initial diversity lets the weights follow different paths through gradient descent, so the neurons end up learning different things.
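To see the symmetry argument numerically, here is a minimal hand-rolled sketch (plain NumPy rather than TensorFlow, with a made-up toy dataset): a two-unit hidden layer is trained for one gradient step, once with both units initialized identically and once with random initialization. With identical initialization, the two units receive identical gradients and stay clones of each other forever; random initialization breaks that symmetry.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 4 examples, 2 features (a stand-in for the coffee-roasting inputs).
X = rng.normal(size=(4, 2))
y = rng.normal(size=(4, 1))

def one_gradient_step(W1, b1, W2, b2, lr=0.1):
    """One step of gradient descent on MSE, backpropagated by hand."""
    # Forward pass: sigmoid hidden layer -> linear output.
    Z1 = X @ W1 + b1
    A1 = 1.0 / (1.0 + np.exp(-Z1))
    pred = A1 @ W2 + b2
    # Backward pass.
    d_pred = 2.0 * (pred - y) / len(X)
    dW2 = A1.T @ d_pred
    dA1 = d_pred @ W2.T
    dZ1 = dA1 * A1 * (1.0 - A1)
    dW1 = X.T @ dZ1
    return W1 - lr * dW1, W2 - lr * dW2

# Case 1: both hidden units start with the SAME weights (columns of W1 equal).
W1 = np.ones((2, 2)); b1 = np.zeros(2)
W2 = np.ones((2, 1)); b2 = np.zeros(1)
W1_new, _ = one_gradient_step(W1, b1, W2, b2)
same_init_still_equal = np.allclose(W1_new[:, 0], W1_new[:, 1])
print(same_init_still_equal)   # True: the units remain identical after training

# Case 2: random initialization breaks the symmetry.
W1 = rng.normal(size=(2, 2))
W1_new, _ = one_gradient_step(W1, b1, W2, b2)
rand_init_still_equal = np.allclose(W1_new[:, 0], W1_new[:, 1])
print(rand_init_still_equal)   # False: the units diverge and learn different things
```

In Keras this random initialization happens automatically: `tf.keras.layers.Dense` defaults to the `glorot_uniform` kernel initializer, which draws a different random value for every weight.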

Above is how I would explain it without maths, but if you want some simple maths and an example to persuade yourself, you may read this.


Thank you Raymond. This was really helpful.


You are welcome @gauravv. It’s a great question.