I’m doing the weight initialization lab, where we see that initializing the weights to zero causes the network to fail to break symmetry. For context, here’s the text I’m looking at:
The statement is a bit of a simplification. In training we don’t really care about the cost or the predictions directly - what we care about is the errors (y - y_hat), which lead to the gradients.
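For concreteness, here’s roughly what that looks like for the output layer of a sigmoid + cross-entropy network. This is a minimal sketch - the names `A1`, `A2`, `Y` and the helper function are mine, not the lab’s code, and the lab may use the opposite sign convention (y - y_hat rather than y_hat - y):

```python
import numpy as np

def output_layer_grads(A1, A2, Y):
    """Output-layer gradients for sigmoid + cross-entropy.

    Note the cost itself never appears here: the gradients are driven
    entirely by the errors (A2 - Y, i.e. y_hat - y; sign convention varies).
    """
    m = Y.shape[1]
    dZ2 = A2 - Y                                  # the errors
    dW2 = dZ2 @ A1.T / m                          # gradient w.r.t. weights
    db2 = np.sum(dZ2, axis=1, keepdims=True) / m  # gradient w.r.t. bias
    return dW2, db2
```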
And initializing to zero isn’t the real issue - initializing all the weights to any constant value would cause a similar problem.
The issue they’re trying to demonstrate is that if the errors flowing into every hidden unit are identical, then the gradients will all be identical, so every hidden layer unit gets exactly the same update and they all learn exactly the same thing - the symmetry never breaks.
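Here’s a minimal sketch of that symmetry, assuming a one-hidden-layer sigmoid network - the architecture, sizes, and names are illustrative, not the lab’s actual code:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def train_step(W1, W2, X, Y, lr=0.1):
    """One gradient step for a 1-hidden-layer sigmoid network
    (biases omitted for brevity; they don't affect the symmetry)."""
    m = Y.shape[1]
    A1 = sigmoid(W1 @ X)                # hidden activations
    A2 = sigmoid(W2 @ A1)               # predictions
    dZ2 = A2 - Y                        # the errors drive everything below
    dW2 = dZ2 @ A1.T / m
    dZ1 = (W2.T @ dZ2) * A1 * (1 - A1)  # identical row for every hidden unit
    dW1 = dZ1 @ X.T / m
    return W1 - lr * dW1, W2 - lr * dW2

rng = np.random.default_rng(0)
X = rng.normal(size=(3, 5))                   # 3 features, 5 examples
Y = rng.integers(0, 2, size=(1, 5)).astype(float)

# Every weight starts at the same constant - 0.5 here, not just zero.
W1 = np.full((4, 3), 0.5)                     # 4 hidden units
W2 = np.full((1, 4), 0.5)

for _ in range(1000):
    W1, W2 = train_step(W1, W2, X, Y)

print(W1)  # all 4 rows are still identical after 1000 steps
```

Running this prints four identical rows in `W1`: the hidden units received identical gradients at every step, so they never differentiate. The same thing happens starting from zeros or any other constant - the particular value doesn’t matter.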
Again, this is a little bit of a simplification. But I think the key point is that the notebook’s explanation should be framed in terms of the gradients, not the loss.