Hi, in the week 3 programming assignment we initialize the bias variables using the Glorot Initializer instead of zeros as we have been doing up until this point.

Is there a reason for this? And are there any benefits to using zeros vs Glorot initialization?

This is an interesting point that I missed until you pointed it out. It turns out that for Symmetry Breaking (which is required, as explained on this thread and as we saw in the Initialization exercise in Week 1), it suffices to initialize either the weights or the bias values randomly and set the others to zero. But there is no harm in initializing both the weights and the bias values randomly: you've still done the required Symmetry Breaking. The question is then just whether you get better convergence when you use non-zero values for *both*, and whether Glorot (also known as Xavier) initialization is better than He initialization or any of the other possibilities. These are "hyperparameters" and the only way to know what works best is to try the various combinations.
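To see why randomizing *either* the weights *or* the biases suffices, here is a minimal NumPy sketch (a toy 3-2-1 tanh network of my own construction, not the assignment's code). With everything zeroed, both hidden units receive identical gradients forever; with zero weights but random biases, the gradients already differ:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))        # 4 samples, 3 features
y = rng.normal(size=(4, 1))

def hidden_grads(W1, b1):
    # One forward/backward pass through a tiny 3-2-1 tanh network with a
    # squared-error loss; returns the gradient of each hidden unit's
    # incoming weights, one column per hidden unit.
    W2 = np.ones((2, 1))
    z1 = x @ W1 + b1
    a1 = np.tanh(z1)
    z2 = a1 @ W2
    dz2 = (z2 - y) / len(x)
    dz1 = (dz2 @ W2.T) * (1 - a1 ** 2)
    return x.T @ dz1               # shape (3, 2)

# All-zero weights and biases: the two hidden units get identical gradients,
# so they stay identical no matter how long you train.
g_zero = hidden_grads(np.zeros((3, 2)), np.zeros(2))
print(np.allclose(g_zero[:, 0], g_zero[:, 1]))        # symmetric

# Zero weights but random biases: the symmetry is already broken.
g_rand_bias = hidden_grads(np.zeros((3, 2)), rng.normal(size=2))
print(np.allclose(g_rand_bias[:, 0], g_rand_bias[:, 1]))  # not symmetric
```

The same argument runs with the roles reversed (random weights, zero biases), which is the convention used in the earlier assignments.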

It would be interesting to run the experiment here in this exercise: create another version of the initialization that uses *tf.zeros* for all the bias values, and then compare convergence between the two styles of initialization. Then try some of the other algorithms besides Glorot and see what effect that has. Here's the menu of possible initialization functions that TF provides.
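As a rough sketch of what the two styles look like, here is a hypothetical NumPy helper (my own illustration, not the assignment's code) that builds parameters with Glorot-uniform weights and either Glorot-initialized or zeroed biases. The weight bound sqrt(6 / (fan_in + fan_out)) is the standard Glorot uniform limit; the bound I use for the 1-D bias case is an illustrative assumption, since how a framework derives fans for a bias vector is an implementation detail:

```python
import numpy as np

def init_params(layer_dims, bias_style="glorot", seed=0):
    # Hypothetical helper: Glorot-uniform weights, with biases either
    # Glorot-initialized or zeroed -- the two styles discussed above.
    rng = np.random.default_rng(seed)
    params = {}
    for l in range(1, len(layer_dims)):
        fan_in, fan_out = layer_dims[l - 1], layer_dims[l]
        limit = np.sqrt(6.0 / (fan_in + fan_out))  # Glorot/Xavier uniform bound
        params[f"W{l}"] = rng.uniform(-limit, limit, size=(fan_out, fan_in))
        if bias_style == "glorot":
            # Assumed fan convention for a 1-D bias; frameworks may differ.
            b_limit = np.sqrt(6.0 / (1 + fan_out))
            params[f"b{l}"] = rng.uniform(-b_limit, b_limit, size=(fan_out, 1))
        else:
            params[f"b{l}"] = np.zeros((fan_out, 1))
    return params

glorot_biases = init_params([3, 4, 1], bias_style="glorot")
zero_biases = init_params([3, 4, 1], bias_style="zeros")
```

Training the same model from each of these starting points (same seed for the weights) and plotting the cost curves side by side would be the cleanest way to compare the two styles.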

Thanks for pointing this out and let us know if you try any such experiments and notice anything interesting. Science!