Weight Initialization for Deep Networks : week 1

Anbu · June 16, 2021, 2:43pm

Hi Sir,

Random Initialization 2

What does the statement below mean? Is the value of 2 something we should tune here?

At 5:09 in the lecture video: "If you wish, the variance here, this variance parameter, could be another thing that you could tune with your hyperparameters. So you could have another parameter that multiplies into this formula and tune that multiplier as part of your hyperparameter search."

jonaslalin · June 17, 2021, 5:40pm

Hello @Anbu,

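This is called He initialization. Its variance, 2/n (where n is the number of inputs to the layer), is 2 times the variance of the Xavier initialization shown in the lecture, 1/n, so the weights themselves are scaled by sqrt(2), not 2. He initialization works better for ReLU activation functions. Xavier is a good choice for tanh activation functions.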
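To make the difference concrete, here is a minimal NumPy sketch, assuming the course's simplified formulas (variance 2/n_in for He, 1/n_in for Xavier; the original Xavier paper uses a slightly different denominator). The function name `init_layer` and the layer sizes are hypothetical and chosen only for illustration.

```python
import numpy as np

def init_layer(n_in, n_out, scheme="he", rng=None):
    """Draw an (n_out, n_in) weight matrix with the chosen variance.

    Hypothetical helper: "he" uses variance 2/n_in, "xavier" uses 1/n_in,
    following the simplified formulas from the lecture.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    variance = {"he": 2.0 / n_in, "xavier": 1.0 / n_in}[scheme]
    # Scale standard-normal samples by the target standard deviation.
    return rng.standard_normal((n_out, n_in)) * np.sqrt(variance)

W_he = init_layer(512, 256, "he")
W_xavier = init_layer(512, 256, "xavier")

# The empirical variances should be close to 2/512 and 1/512 respectively.
print("He variance:    ", W_he.var())
print("Xavier variance:", W_xavier.var())
```

The factor of 2 applies to the variance, so the He weights are larger than the Xavier weights by a factor of sqrt(2).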


Anbu · June 20, 2021, 7:32am

What does parameter tuning mean here, sir? Which parameter is going to be tuned?

Should we tune the value of 2 here?

jonaslalin · June 20, 2021, 7:52am

Most likely, you don’t need to tune either He or Xavier initialization. They work well for most problems with ReLU or tanh-like activation functions.
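If you did want to treat the 2 as a tunable multiplier, as the lecture suggests is possible, it could look roughly like the sketch below. This is only an illustration: `init_weights` and its `gain` parameter are hypothetical names, and the candidate values are arbitrary. In a real search you would train a model for each value and compare validation results instead of just printing the scale.

```python
import numpy as np

def init_weights(n_in, n_out, gain=2.0, seed=0):
    """Hypothetical helper: weights with variance gain / n_in.

    gain=2.0 reproduces He initialization and gain=1.0 the lecture's
    Xavier variant; other values are what a hyperparameter search would try.
    """
    rng = np.random.default_rng(seed)
    return rng.standard_normal((n_out, n_in)) * np.sqrt(gain / n_in)

# Arbitrary candidate multipliers, for illustration only.
for gain in (1.0, 2.0, 3.0):
    W = init_weights(1024, 1024, gain)
    # W.var() * n_in should come out close to the chosen gain.
    print(f"gain={gain}: var * n_in = {W.var() * 1024:.3f}")
```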
