Hi, I just finished watching DLS1-W3’s video on random initialization. The video mentioned that if we initialized the weights to a zero matrix, then each neuron within the layer would learn the same thing. However, why is initializing to a random, non-zero matrix better than initializing to an identity matrix (or the identity matrix multiplied by a small number)?
Thanks!
The “weight” matrices here are typically not square, so the identity matrix is not really an option (unless you meant a matrix full of ones). But the point is that any uniform value gives the same result for every neuron, so the problem is not really zero values per se: it is the “symmetry” of the initialization that is the problem. Here’s a thread about the mathematics behind why “symmetry breaking” is required.
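To make the symmetry argument concrete, here is a minimal NumPy sketch (the 3 → 4 → 1 network shape, the data, and the names `W1`/`W2` are made up for illustration, loosely following the course notation). With every weight set to the same constant, one backprop pass produces identical gradient rows for all hidden units, so they can never diverge from each other; random initialization breaks that symmetry.

```python
import numpy as np

# Minimal 3 -> 4 -> 1 network, one backprop pass (made-up data; a sketch,
# not the assignment's code).
np.random.seed(0)
X = np.random.randn(3, 5)               # 5 training examples
Y = (np.random.rand(1, 5) > 0.5) * 1.0  # binary labels

def dW1_after_one_pass(W1, W2):
    Z1 = W1 @ X
    A1 = np.tanh(Z1)                     # hidden activations
    Z2 = W2 @ A1
    A2 = 1.0 / (1.0 + np.exp(-Z2))       # sigmoid output
    dZ2 = A2 - Y                         # cross-entropy gradient at output
    dZ1 = (W2.T @ dZ2) * (1.0 - A1**2)   # backprop through tanh
    return (dZ1 @ X.T) / X.shape[1]      # gradient w.r.t. W1

# Uniform init: every hidden unit starts identical, and the gradient rows
# come out identical too, so the units would stay clones forever.
sym = dW1_after_one_pass(np.ones((4, 3)) * 0.01, np.ones((1, 4)) * 0.01)
print(np.allclose(sym, sym[0]))          # True: all rows equal

# Random init breaks the symmetry: each unit gets a different gradient.
rnd = dW1_after_one_pass(np.random.randn(4, 3) * 0.01,
                         np.random.randn(1, 4) * 0.01)
print(np.allclose(rnd, rnd[0]))          # False: rows differ
```

Note that the uniform case stays symmetric even though the constant is non-zero, which is exactly why randomness, not merely non-zero values, is what matters.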
It turns out that there are a number of different algorithms you can use for random initialization, but that is a more advanced topic that Prof Ng will cover in Course 2 of this series, so please “stay tuned” for that.
Thank you! That’s very helpful.