Zero initialization of weights

Celsa_Pardo_Araujo · October 26, 2021, 12:01pm

I didn’t understand the explanation in the first assignment of Week 1 of why it is not good to initialize the weights to zero. What I didn’t understand is this: “you get the same loss value for both, so none of the weights get adjusted and you are stuck with the same old value of the weights.” But we adjust the weights using the derivative of the loss with respect to them, not the value of the loss.

paulinpaloalto · October 26, 2021, 4:44pm

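But the point is that the derivatives turn out to be zero in that case. Remember that the gradients are the derivatives of the cost w.r.t. the parameters. In a network with ReLU hidden layers, all-zero weights make every hidden activation zero, and each weight gradient is a product that includes either those activations or the zero weights themselves, so it comes out as zero. If the cost (average of the loss) is not changing, then that means the derivatives are zero, right? As with everything, it always goes back to the math. Here’s a thread which discusses Symmetry Breaking in more detail. It also explains why Symmetry Breaking was not required in the Logistic Regression case.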

PS that thread was linked from the FAQ Thread, which is also worth a look just on General Principles.
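
The zero-gradient claim above can be checked numerically. What follows is only a minimal sketch, not the assignment's code: it assumes a tiny network with one ReLU hidden layer and a sigmoid output, made-up data, and a hypothetical helper named `gradients`. With everything initialized to zero, both weight gradients come out exactly zero (in this sketch the output bias gradient does not, but the bias alone cannot make the weights start moving). The last lines do the same check for plain Logistic Regression, where the gradient depends directly on the inputs and is generally nonzero even when starting from zero.

```python
import numpy as np

def relu(z):
    return np.maximum(0, z)

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def gradients(X, Y, W1, b1, W2, b2):
    """Hypothetical helper: forward and backward pass for a
    2-layer net (ReLU hidden layer, sigmoid output, cross-entropy cost)."""
    m = X.shape[1]
    # Forward pass
    Z1 = W1 @ X + b1
    A1 = relu(Z1)
    Z2 = W2 @ A1 + b2
    A2 = sigmoid(Z2)
    # Backward pass
    dZ2 = A2 - Y
    dW2 = dZ2 @ A1.T / m
    db2 = dZ2.sum(axis=1, keepdims=True) / m
    dA1 = W2.T @ dZ2
    dZ1 = dA1 * (Z1 > 0)          # ReLU derivative, taken as 0 at Z1 == 0
    dW1 = dZ1 @ X.T / m
    db1 = dZ1.sum(axis=1, keepdims=True) / m
    return dW1, db1, dW2, db2

# Made-up data: 2 features, 5 examples
rng = np.random.default_rng(0)
X = rng.normal(size=(2, 5))
Y = np.array([[0, 1, 1, 0, 1]])

# Zero initialization of every parameter
W1, b1 = np.zeros((3, 2)), np.zeros((3, 1))
W2, b2 = np.zeros((1, 3)), np.zeros((1, 1))

dW1, db1, dW2, db2 = gradients(X, Y, W1, b1, W2, b2)
print(np.all(dW1 == 0), np.all(dW2 == 0))   # True True: the weights never move
print(db2)                                   # only the output bias gets a gradient

# Logistic Regression from zero weights: A = sigmoid(0) = 0.5 for every example,
# and dw = X (A - Y)^T / m depends on X, so it is generally nonzero.
m = X.shape[1]
A = sigmoid(np.zeros((1, m)))
dw_logreg = X @ (A - Y).T / m
print(dw_logreg)
```

Since `dW1` and `dW2` are zero, a gradient descent step `W = W - learning_rate * dW` leaves the weights exactly where they started, which is the "stuck with the same old value" behaviour described in the assignment.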

Celsa_Pardo_Araujo · October 27, 2021, 8:15am

Thanks for the reply!
