Question on the mathematical intuition behind Xavier initialization

Stephano_Cotsoradis · February 19, 2024, 1:35am

I decided to dive deeper into Xavier initialization to try to build a more mathematical intuition for why the formula works. The paper that introduced the formula says that, as a compromise between 1/n_in and 1/n_out, it just uses 2/(n_in + n_out). But why this particular choice? It doesn't look like an average or a harmonic mean to me; the paper just calls it a compromise.

TMosh · February 19, 2024, 2:41am

This might be a useful read:

rmwkwok · February 19, 2024, 3:22am

If I look only at the equations presented, without other context, equation [12] seems to be the result of adding equation [10] and equation [11]. You see, equations [10] and [11] imply two different variance values, so a compromise amounts to assuming a third, common value that “satisfies” both equations, and doing so leads to equation [12].
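To make the "adding up" step concrete, here is a short worked sketch. It assumes equations [10] and [11] are the forward and backward conditions n_in · Var(W) = 1 and n_out · Var(W) = 1, which is how they are usually stated; the symbols below are my own shorthand, not necessarily the paper's notation.

```latex
% Assumed forms of the two conditions (forward and backward pass):
%   n_in  * Var(W) = 1   and   n_out * Var(W) = 1
\begin{align}
  n_{\mathrm{in}}\,\mathrm{Var}(W) + n_{\mathrm{out}}\,\mathrm{Var}(W) &= 1 + 1 \\
  \mathrm{Var}(W) &= \frac{2}{n_{\mathrm{in}} + n_{\mathrm{out}}}
\end{align}
% Equivalent readings of the same value:
%   the reciprocal of the arithmetic mean of n_in and n_out,
%   or the harmonic mean of a = 1/n_in and b = 1/n_out:
\begin{equation}
  \frac{2ab}{a+b}
  = \frac{2 \cdot \frac{1}{n_{\mathrm{in}}} \cdot \frac{1}{n_{\mathrm{out}}}}
         {\frac{1}{n_{\mathrm{in}}} + \frac{1}{n_{\mathrm{out}}}}
  = \frac{2}{n_{\mathrm{in}} + n_{\mathrm{out}}}
\end{equation}
```

So the value can be read as the harmonic mean of the two candidate variances, although whether the authors thought of it that way is a guess on my part.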

Cheers,
Raymond

Stephano_Cotsoradis · February 19, 2024, 6:21pm

But when the layers are different sizes, won't simply adding the two equations fail to work, since n_l and n_l+1 are different? Or did the researchers just find that the formula was good enough even when the sizes didn't match up exactly?

rmwkwok · February 20, 2024, 1:00am

You really need to ask the authors if you want to know their intention. We are all just making guesses.
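What can be checked without guessing is how far each condition is off when the layer sizes differ. Below is a minimal sketch, assuming a purely linear layer (no activation), zero-mean unit-variance inputs, and Gaussian weights with variance 2/(n_in + n_out). The function names are made up for illustration and do not come from the paper or any library.

```python
import numpy as np


def xavier_variance(n_in, n_out):
    """Compromise variance 2 / (n_in + n_out)."""
    return 2.0 / (n_in + n_out)


def variance_gains(n_in, n_out):
    """Theoretical variance scaling of a linear layer under the compromise.

    forward gain  = n_in  * Var(W)  (ideal value: 1)
    backward gain = n_out * Var(W)  (ideal value: 1)
    """
    var_w = xavier_variance(n_in, n_out)
    return n_in * var_w, n_out * var_w


def empirical_forward_gain(n_in, n_out, n_samples=2000, seed=0):
    """Estimate Var(output) / Var(input) for one randomly drawn linear layer."""
    rng = np.random.default_rng(seed)
    std_w = np.sqrt(xavier_variance(n_in, n_out))
    W = rng.normal(0.0, std_w, size=(n_out, n_in))
    x = rng.normal(0.0, 1.0, size=(n_samples, n_in))
    y = x @ W.T  # linear layer, no bias, no activation
    return y.var() / x.var()


if __name__ == "__main__":
    for n_in, n_out in [(256, 256), (512, 128), (128, 512)]:
        fwd, bwd = variance_gains(n_in, n_out)
        est = empirical_forward_gain(n_in, n_out)
        print(f"n_in={n_in:4d} n_out={n_out:4d} "
              f"forward={fwd:.3f} backward={bwd:.3f} "
              f"forward (simulated)={est:.3f}")
```

Under these assumptions the two gains always sum to 2, and both equal 1 only when n_in = n_out. With unequal sizes neither condition holds exactly; the error is split between the forward and backward pass.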
