Should I use the “He” weight initialization technique ONLY with very deep neural networks (10-150 layers), or should I use it even for neural networks that have 2-4 layers? Thanks for the clarification.
It is not limited to deep networks.
As Tom says, this technique is generally applicable. There is no single initialization algorithm that works best in all cases, but He initialization is one of the first things to try, and it works well in many cases even when the network is not deep.
In fact, we saw a concrete example of this in DLS C1 W4 A2. Take a look at how they did the initialization there for the 4 layer network: they actually use a variant of He or Xavier initialization. That is because the simple initialization they had us build in the previous assignment gives really terrible convergence in that particular case. You can try it both ways and see a very concrete example of how the more sophisticated init algorithm can help even in a relatively shallow network.
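For reference, here is a minimal numpy sketch of He initialization. This is illustrative rather than the assignment's exact code: `initialize_he` and `layer_dims` are made-up names, and it assumes the DLS convention that `W[l]` has shape `(n[l], n[l-1])`.

```python
import numpy as np

def initialize_he(layer_dims, seed=0):
    """He initialization: Gaussian weights scaled by sqrt(2 / fan_in).

    layer_dims -- list of layer sizes, e.g. [n_x, n_h1, n_h2, n_y]
    """
    rng = np.random.default_rng(seed)
    params = {}
    for l in range(1, len(layer_dims)):
        fan_in = layer_dims[l - 1]
        # Scale by sqrt(2 / fan_in) so activation variance stays roughly
        # stable across layers with ReLU units.
        params[f"W{l}"] = rng.standard_normal(
            (layer_dims[l], fan_in)
        ) * np.sqrt(2.0 / fan_in)
        params[f"b{l}"] = np.zeros((layer_dims[l], 1))
    return params

# Example: a shallow network (2 hidden layers) still benefits from this.
params = initialize_he([5, 4, 3, 1])
```

The only difference from the simple random init built earlier in the course is the `sqrt(2 / fan_in)` scaling factor, which is what keeps the signal from shrinking or exploding as it passes through the layers.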