Zero initialization of weights

paulinpaloalto · October 26, 2021, 4:44pm

But the point is that the derivatives turn out to be zero in that case. Remember that the gradients are the derivatives of the cost w.r.t. the parameters. If the cost (the average of the loss) is not changing from one iteration to the next, then the parameter updates are zero, which means the derivatives are zero, right? As with everything, it always goes back to the math. Here’s a thread which discusses Symmetry Breaking in more detail. It also explains why Symmetry Breaking was not required in the Logistic Regression case.

PS: that thread is linked from the FAQ Thread, which is also worth a look on general principles.
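
To make the zero-derivative point concrete, here is a minimal sketch in plain Python. It is not the assignment's code: it assumes a network with one ReLU hidden layer and a sigmoid output, a tiny made-up dataset with balanced labels, and the hypothetical helper names `net_gradients` and `logreg_gradients`. With all-zero parameters, the hidden activations are zero and the output weights are zero, so every weight gradient vanishes. The output bias gradient is the mean of (a - y), which is also zero here only because the labels are balanced. Logistic Regression is included for contrast: its weight gradient is multiplied by the inputs rather than by hidden activations, so it is generally nonzero even at zero initialization.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def net_gradients(X, Y, W1, b1, W2, b2):
    """Cost and gradients for a 1-hidden-layer net (ReLU hidden, sigmoid output)."""
    m, n_x, n_h = len(X), len(X[0]), len(W1)
    dW1 = [[0.0] * n_x for _ in range(n_h)]
    db1 = [0.0] * n_h
    dW2 = [0.0] * n_h
    db2 = 0.0
    cost = 0.0
    for x, y in zip(X, Y):
        # Forward pass
        z1 = [sum(W1[j][k] * x[k] for k in range(n_x)) + b1[j] for j in range(n_h)]
        a1 = [max(z, 0.0) for z in z1]
        a2 = sigmoid(sum(W2[j] * a1[j] for j in range(n_h)) + b2)
        cost += -(y * math.log(a2) + (1 - y) * math.log(1 - a2)) / m
        # Backward pass (cross-entropy loss with sigmoid output)
        dz2 = a2 - y
        db2 += dz2 / m
        for j in range(n_h):
            dW2[j] += dz2 * a1[j] / m          # zero because a1 is zero
            dz1 = W2[j] * dz2 * (1.0 if z1[j] > 0 else 0.0)  # zero because W2 is zero
            db1[j] += dz1 / m
            for k in range(n_x):
                dW1[j][k] += dz1 * x[k] / m
    return cost, dW1, db1, dW2, db2

def logreg_gradients(X, Y, w, b):
    """Weight gradient for Logistic Regression: mean of (a - y) * x."""
    m, n_x = len(X), len(X[0])
    dw = [0.0] * n_x
    for x, y in zip(X, Y):
        a = sigmoid(sum(w[k] * x[k] for k in range(n_x)) + b)
        for k in range(n_x):
            dw[k] += (a - y) * x[k] / m
    return dw

# Made-up data: 4 samples, 2 features, balanced labels
X = [[1.0, 2.0], [-1.5, 0.5], [0.3, -0.7], [2.0, 1.0]]
Y = [1, 0, 1, 0]
n_x, n_h = 2, 3

# Zero initialization of every parameter
W1 = [[0.0] * n_x for _ in range(n_h)]
b1 = [0.0] * n_h
W2 = [0.0] * n_h
b2 = 0.0

cost, dW1, db1, dW2, db2 = net_gradients(X, Y, W1, b1, W2, b2)
print("cost:", cost)  # ln(2), about 0.693, since every prediction is 0.5
print("all network gradients zero:",
      all(g == 0 for row in dW1 for g in row)
      and all(g == 0 for g in db1 + dW2 + [db2]))

dw_lr = logreg_gradients(X, Y, [0.0] * n_x, 0.0)
print("logistic regression dw:", dw_lr)  # nonzero entries
```

Since every gradient in the network is zero, a gradient descent step leaves the parameters where they started, and the cost stays at ln(2) on every iteration. With unbalanced labels the output bias would still move, but the weights would stay at zero.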

