Thanks @shanup !
This thread is definitely a duplicate of the thread you linked in your answer.
It took a while to understand why different initial values would result in different weights. The key seems to be that the gradient descent update for each weight depends on the current weights, and therefore on the initial weights. As such, if the neurons have differing initial weights, their weights are updated differently at every step, and a particular neuron might end up approaching a different local minimum than the other neurons.
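To check that intuition for myself, here's a minimal sketch (my own toy example, not from the assignment code) with a tiny two-hidden-neuron network trained by plain gradient descent. When both neurons start with identical weights they receive identical gradients and stay identical forever; when they start with different weights, their update trajectories diverge and they settle on different weights:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))            # 100 samples, 2 features
y = np.sin(X[:, 0]) + 0.5 * X[:, 1]      # some arbitrary target

def train(W1, w2, lr=0.1, steps=500):
    """Tiny network: y_hat = w2 @ tanh(W1 @ x), trained on squared error."""
    for _ in range(steps):
        H = np.tanh(X @ W1.T)            # (100, 2) hidden activations
        err = H @ w2 - y                 # prediction error, shape (100,)
        grad_w2 = H.T @ err / len(y)     # gradient w.r.t. output weights
        grad_H = np.outer(err, w2) * (1 - H**2)
        grad_W1 = grad_H.T @ X / len(y)  # gradient w.r.t. hidden weights
        w2 = w2 - lr * grad_w2
        W1 = W1 - lr * grad_W1
    return W1

# identical initial weights -> the two hidden neurons stay identical
W_same = train(np.full((2, 2), 0.5), np.full(2, 0.5))
# different initial weights -> the two hidden neurons end up different
W_diff = train(rng.normal(size=(2, 2)), rng.normal(size=2))

print("identical init:\n", W_same)   # both rows equal
print("different init:\n", W_diff)   # rows differ
```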
I think my assumption that all neurons would converge to the same weights was really resting on a second assumption: that there's some single minimum that gets reached regardless of the "direction of descent" (i.e. the classic "soup bowl" of a two-feature scenario).
If that were the case, I would still expect all neurons to converge to the same weights, but I guess in reality, especially with more features and parameters, such a single minimum is extremely unlikely.
Or would the neurons in a 2-feature "soup bowl" scenario still end up with different weights somehow?
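For what it's worth, here's a small sketch (again my own, purely illustrative) of the "soup bowl" intuition on a loss that really is convex: plain linear regression with squared error. Different starting weights do converge to the same minimum. It has no hidden neurons, so it only illustrates the single-minimum part of the picture, not the symmetry question itself:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))                  # 2 features
true_w = np.array([2.0, -3.0])
y = X @ true_w + 0.1 * rng.normal(size=200)    # noisy linear target

def descend(w, lr=0.1, steps=2000):
    """Gradient descent on mean squared error (a convex 'soup bowl')."""
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

# both starting points land on (approximately) the same weights, near [2, -3]
print(descend(np.zeros(2)))
print(descend(np.array([50.0, -50.0])))
```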