I understand that initializing the Ws to zeros will affect the outcome of our predictions, since we will have a symmetry issue.
However, I am confused as to why ‘He’ initialization does better than the regular initialization. Is it only doing better because we are restricting the number of epochs? Would the two methods eventually reach the same result in the long run (taking into consideration that the # of epochs would be different)?
Interesting question! There are many different methods of random initialization and they are not all equivalent. Some of them work better in some (but not all) cases. He Initialization is one of the more sophisticated such algorithms. To see one example of different behavior, try going back to the 4 layer model exercise in C1 W4 A2 and notice that they gave you either He or Xavier Initialization there. If you try using the simple initialization they had us build in C1 W4 “Step by Step”, you’ll find that the convergence is really terrible, but the He init works well.
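For reference, the key idea of He Initialization is that each weight matrix is drawn from a Gaussian and scaled by sqrt(2 / n_prev), where n_prev is the number of units feeding into the layer, which keeps the variance of the activations roughly stable through ReLU layers. Here is a minimal numpy sketch (the function name and the `layer_dims` convention are my own shorthand, not the exact code from the notebook):

```python
import numpy as np

def initialize_he(layer_dims, seed=0):
    """He initialization: N(0, 1) weights scaled by sqrt(2 / n_prev).

    layer_dims: list of layer sizes, e.g. [n_x, n_h1, ..., n_y].
    Returns a dict of weight matrices W1..WL and bias vectors b1..bL.
    """
    rng = np.random.default_rng(seed)
    params = {}
    for l in range(1, len(layer_dims)):
        n_prev, n_curr = layer_dims[l - 1], layer_dims[l]
        # The sqrt(2 / n_prev) factor is the "He" scaling for ReLU layers
        params[f"W{l}"] = rng.standard_normal((n_curr, n_prev)) * np.sqrt(2.0 / n_prev)
        # Biases can safely start at zero; they don't cause symmetry problems
        params[f"b{l}"] = np.zeros((n_curr, 1))
    return params
```

Compare that with the simple `* 0.01` scaling from the “Step by Step” assignment: with many layers, a fixed small factor lets the activations shrink (or blow up) as you go deeper, which is why convergence there is so slow.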
It turns out that there is no one “silver bullet” solution for random initialization that works best in all cases. As Prof Ng says in the lectures here, the choice of initialization algorithm is another “hyperparameter” that you need to choose as the system designer.
Prof Ng mentions the need for Symmetry Breaking in the Neural Net case, but doesn’t really go into the details. If you want to explore that, here’s a thread which shows why zero initialization works for Logistic Regression, but not for Neural Nets.
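To make the symmetry problem concrete, here is a tiny sketch of one gradient step on a 2-layer net (tanh hidden layer, sigmoid output) with all-zero weights. The data and shapes are made up purely for illustration. Because the hidden activations are all zero, the gradient flowing back into W1 is exactly zero, so the hidden units start identical and stay identical:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((3, 5))              # 3 features, 5 examples (made-up data)
Y = (rng.random((1, 5)) > 0.5).astype(float)  # made-up binary labels

# All-zero initialization for a net with 2 hidden units
W1 = np.zeros((2, 3)); b1 = np.zeros((2, 1))
W2 = np.zeros((1, 2)); b2 = np.zeros((1, 1))

# Forward pass
Z1 = W1 @ X + b1; A1 = np.tanh(Z1)           # A1 is all zeros: tanh(0) = 0
Z2 = W2 @ A1 + b2; A2 = 1 / (1 + np.exp(-Z2))

# Backward pass (binary cross-entropy gradients)
m = X.shape[1]
dZ2 = A2 - Y
dW2 = dZ2 @ A1.T / m                          # zero, because A1 is zero
dZ1 = (W2.T @ dZ2) * (1 - A1 ** 2)            # zero, because W2 is zero
dW1 = dZ1 @ X.T / m                           # so dW1 is exactly zero

W1 -= 0.1 * dW1                               # one update step changes nothing
print(np.allclose(W1[0], W1[1]))              # the two hidden rows stay identical
```

In Logistic Regression there is no hidden layer, so there is nothing to be symmetric with, and gradient descent can move the single weight vector off zero on the first step, which is the essence of the argument in that thread.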