Weight Initialization for Deep Networks: Week 1

Hi Sir,

Random Initialization: the value of 2

What does the statement below mean? Is the value of 2 something we should tune here?

From the lecture video at 5:09: "If you wish, the variance here, this variance parameter, could be another thing that you could tune with your hyperparameters. So you could have another parameter that multiplies into this formula and tune that multiplier as part of your hyperparameter search."
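
For reference, I believe this is the formula from the slide (writing it from memory, so the exact shapes may differ):

```python
import numpy as np

n_prev, n_l = 64, 32   # units in the previous layer and in layer l (example sizes)
W_l = np.random.randn(n_l, n_prev) * np.sqrt(2.0 / n_prev)   # the "2" I am asking about
```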

Hello @Anbu,

This is called He initialization: its variance is 2 times that of Xavier initialization, and it works better for ReLU activation functions. Xavier is a good choice for tanh activation functions.
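
To give a feel for why that factor of 2 matters with ReLU, here is a small, self-contained NumPy experiment (purely illustrative; the layer width, depth, and batch size are made up). It pushes a random batch through a stack of ReLU layers and compares He-style scaling with Xavier-style scaling:

```python
import numpy as np

np.random.seed(0)
n, depth = 512, 20                      # layer width and number of layers (arbitrary)
for scale, name in [(2.0, "He"), (1.0, "Xavier")]:
    a = np.random.randn(n, 1000)        # a random batch of 1000 inputs
    for _ in range(depth):
        W = np.random.randn(n, n) * np.sqrt(scale / n)
        a = np.maximum(0, W @ a)        # linear layer followed by ReLU
    print(f"{name:6s}: activation std after {depth} ReLU layers = {a.std():.6f}")
```

With the factor of 2, the activation scale stays roughly constant with depth; with Xavier's scaling under ReLU, it shrinks layer after layer, which is the usual argument for using He initialization with ReLU.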

To read more:


What does parameter tuning mean here, sir? Which parameter is going to be tuned?

Should we tune the value of 2 here?

Most likely, you don't need to tune either He or Xavier initialization. They work well for most problems with ReLU or tanh-like activation functions.
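
To connect this back to the lecture quote: the thing you *could* tune, if you ever wanted to, is the multiplier inside the square root. A minimal sketch, assuming a NumPy setup like the course notebooks (the names `initialize_layer` and `variance_scale` are mine, not from the course):

```python
import numpy as np

def initialize_layer(n_prev, n_curr, variance_scale=2.0):
    # Weight variance is variance_scale / n_prev.
    # variance_scale = 2.0 -> He initialization (usual choice with ReLU)
    # variance_scale = 1.0 -> Xavier initialization (usual choice with tanh)
    # In principle variance_scale could be searched over like any other
    # hyperparameter, but the defaults above are almost always good enough.
    W = np.random.randn(n_curr, n_prev) * np.sqrt(variance_scale / n_prev)
    b = np.zeros((n_curr, 1))
    return W, b

W1, b1 = initialize_layer(64, 32)   # He-style scaling by default
```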