Initial Parameter Values in Neural Networks (Deep Learning Special Course, Course 1 Week 3)

paulinpaloalto · October 24, 2022, 2:52pm

Yes, you are correct that you can “break symmetry” by making the W values constant and the b values random. My guess is that common practice randomizes W instead because that gives better convergence in most cases, but you can run some experiments and see whether you notice any difference. Here’s a thread from a while back that discusses Symmetry Breaking in more detail.
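If you want to try that kind of experiment, here is a minimal NumPy sketch. It is not from the course notebooks: the toy data, the layer sizes, the learning rate, and the helper names `train_two_layer` and `rows_identical` are all assumptions made up for illustration. It trains the same small network twice with constant W values, once with zero biases and once with random biases, and checks whether the hidden units still share identical weights afterwards.

```python
import numpy as np

# Hypothetical toy setup (not from the course notebooks): 2 inputs, 4 tanh
# hidden units, 1 sigmoid output, trained with plain gradient descent.

def train_two_layer(W1, b1, W2, b2, X, Y, steps=50, lr=0.5):
    """Run gradient descent on copies of the parameters and return them."""
    W1, b1, W2, b2 = (p.copy() for p in (W1, b1, W2, b2))
    m = X.shape[1]
    for _ in range(steps):
        # Forward pass
        A1 = np.tanh(W1 @ X + b1)
        A2 = 1.0 / (1.0 + np.exp(-(W2 @ A1 + b2)))
        # Backward pass for the cross-entropy loss
        dZ2 = A2 - Y
        dZ1 = (W2.T @ dZ2) * (1.0 - A1 ** 2)
        # Gradient descent update
        W2 -= lr * (dZ2 @ A1.T) / m
        b2 -= lr * dZ2.sum(axis=1, keepdims=True) / m
        W1 -= lr * (dZ1 @ X.T) / m
        b1 -= lr * dZ1.sum(axis=1, keepdims=True) / m
    return W1, b1, W2, b2

def rows_identical(W):
    """True if every row of W (one row per hidden unit) is the same."""
    return bool(np.allclose(W, W[0]))

rng = np.random.default_rng(0)
X = rng.standard_normal((2, 200))          # made-up inputs
Y = (X[0:1] + X[1:2] > 0).astype(float)    # made-up binary labels, shape (1, 200)

n_h = 4
W1 = np.full((n_h, 2), 0.5)                # constant weights in both layers
W2 = np.full((1, n_h), 0.5)
b2 = np.zeros((1, 1))

# Case 1: constant W, zero b. Every hidden unit starts out identical.
sym_W1, *_ = train_two_layer(W1, np.zeros((n_h, 1)), W2, b2, X, Y)
# Case 2: constant W, random b. Only the biases differ at the start.
broken_W1, *_ = train_two_layer(W1, rng.standard_normal((n_h, 1)), W2, b2, X, Y)

print("constant W, zero b   -> hidden weights identical:", rows_identical(sym_W1))
print("constant W, random b -> hidden weights identical:", rows_identical(broken_W1))
```

This only shows whether symmetry gets broken. To compare convergence, you would also want to record the cost at each step for both choices and plot the curves.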

Note that there are a number of different random initialization algorithms. We are shown a very simple one in Week 3 and Week 4 of Course 1, but it turns out that this straightforward approach does not always work very well. Prof Ng will show us some more sophisticated initialization algorithms and discuss these issues in more detail in Course 2, so stay tuned for that. I point this out to give some background on my comment that there may be a reason for not using the bias values for symmetry breaking. Initialization affects how quickly and how well training converges, and there is no single “silver bullet” solution that works best in all cases.
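To make “different initialization algorithms” a bit more concrete, here is a hedged sketch. The post above does not name specific schemes, so the choice of three is an assumption: a small fixed scale of 0.01, a Xavier-style scale (the 1/n_in variant), and a He-style scale. The function name `init_layer` and its parameters are hypothetical, not a course or library API.

```python
import numpy as np

def init_layer(n_out, n_in, scheme="small", rng=None):
    """Return (W, b) for one layer; W is random, b starts at zero.

    The scheme names are illustrative labels, not an official API:
      "small"  : standard normal scaled by a fixed 0.01
      "xavier" : standard normal scaled by sqrt(1 / n_in)
      "he"     : standard normal scaled by sqrt(2 / n_in)
    """
    rng = np.random.default_rng() if rng is None else rng
    scales = {
        "small": 0.01,
        "xavier": np.sqrt(1.0 / n_in),
        "he": np.sqrt(2.0 / n_in),
    }
    W = rng.standard_normal((n_out, n_in)) * scales[scheme]
    b = np.zeros((n_out, 1))
    return W, b

# Compare the spread of the weights each scheme produces for one layer shape.
demo_rng = np.random.default_rng(1)
for name in ("small", "xavier", "he"):
    W_demo, b_demo = init_layer(400, 500, scheme=name, rng=demo_rng)
    print(name, "std of W:", float(W_demo.std()))
```

The only thing that changes between the schemes is the scale factor on the random W values. The fixed 0.01 ignores the layer width, while the other two shrink the weights as the number of inputs grows.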

