Week 3 Random Initialization

In the lecture, Prof. Ng suggests that we should randomly initialize the parameters W and b. He then mentions that the W parameters must not all be 0, but that b can be zero. I agree. However, I am wondering why we cannot do the reverse: make the b values nonzero but set W to zero. Won’t this also work?

Yes, you can do it either way and still “break symmetry”. But the common practice is to randomly set the W values and zero the b values. My guess is that this gives you faster learning, but this is an experimental science: you can try it both ways and see what happens. Well, I guess there are three different methods to compare: random W and zero b, zero W and random b, and both of them random.
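
If you want to try that comparison yourself, here is a minimal sketch in plain Python (no NumPy, so it runs anywhere). Everything in it is an assumption chosen for illustration rather than anything from the course: the XOR toy data, the network size (3 tanh hidden units, 1 sigmoid output), the learning rate, the step count, and the helper names `init`, `train` and `max_row_gap`. It trains the same tiny network under each initialization scheme, with all-zero parameters added as a baseline, and reports how far apart the hidden units' input weights are afterwards. A gap of exactly 0 means the hidden units are still identical.

```python
import math
import random

# Toy data: XOR on two binary inputs (an assumption, chosen only for illustration).
X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
Y = [0.0, 1.0, 1.0, 0.0]
N_HIDDEN = 3


def init(rng, random_w, random_b):
    """Build parameters; each group is either uniform in [-1, 1] or all zeros."""
    def draw(is_random):
        return rng.uniform(-1.0, 1.0) if is_random else 0.0
    W1 = [[draw(random_w), draw(random_w)] for _ in range(N_HIDDEN)]
    b1 = [draw(random_b) for _ in range(N_HIDDEN)]
    W2 = [draw(random_w) for _ in range(N_HIDDEN)]
    b2 = draw(random_b)
    return W1, b1, W2, b2


def train(W1, b1, W2, b2, steps=50, lr=0.5):
    """Batch gradient descent on a tanh-hidden, sigmoid-output network; returns W1."""
    m = len(X)
    for _ in range(steps):
        dW1 = [[0.0, 0.0] for _ in range(N_HIDDEN)]
        db1 = [0.0] * N_HIDDEN
        dW2 = [0.0] * N_HIDDEN
        db2 = 0.0
        for x, y in zip(X, Y):
            # Forward pass for one example.
            a1 = [math.tanh(W1[j][0] * x[0] + W1[j][1] * x[1] + b1[j])
                  for j in range(N_HIDDEN)]
            a2 = 1.0 / (1.0 + math.exp(-(sum(w * a for w, a in zip(W2, a1)) + b2)))
            # Backward pass (cross-entropy loss), averaged over the batch.
            dz2 = (a2 - y) / m
            db2 += dz2
            for j in range(N_HIDDEN):
                dz1 = W2[j] * dz2 * (1.0 - a1[j] ** 2)
                dW2[j] += dz2 * a1[j]
                dW1[j][0] += dz1 * x[0]
                dW1[j][1] += dz1 * x[1]
                db1[j] += dz1
        # Gradient step on every parameter.
        for j in range(N_HIDDEN):
            W1[j][0] -= lr * dW1[j][0]
            W1[j][1] -= lr * dW1[j][1]
            b1[j] -= lr * db1[j]
            W2[j] -= lr * dW2[j]
        b2 -= lr * db2
    return W1


def max_row_gap(W1):
    """Largest difference between hidden unit 0's input weights and any other unit's."""
    return max(abs(W1[0][k] - row[k]) for row in W1[1:] for k in range(2))


schemes = {
    "zero W, zero b": (False, False),
    "random W, zero b": (True, False),
    "zero W, random b": (False, True),
    "random W, random b": (True, True),
}
results = {}
for name, (random_w, random_b) in schemes.items():
    rng = random.Random(0)  # fixed seed so the run is repeatable
    results[name] = max_row_gap(train(*init(rng, random_w, random_b)))
    print(f"{name:20s} gap between hidden units after training: {results[name]:.6f}")
```

The interesting row is "zero W, random b": the input weights start out identical (all zero), yet they should drift apart during training because the different biases give each hidden unit a different activation and therefore a different gradient. This only checks whether symmetry is broken on one toy problem; it says nothing about which scheme learns fastest in general.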

Here’s a thread which goes into a bit more detail about Symmetry Breaking and mentions the issue that you point out.

It is also worth pointing out that the initialization algorithm turns out to be an important “hyperparameter”, meaning a design choice that you need to make. There are a large number of different algorithms that have been developed, but no one “silver bullet” solution that works the best in all cases. So doing the experiment suggested above on one particular dataset and model is just one data point in a very large search space.

This is a more advanced topic that Prof. Ng does not have time to cover here in Course 1. We are just getting started and there is a lot more to say. He will go into more detail on initialization in Week 1 of Course 2, so please stay tuned for that.

Prof. Ng mentioned that W should not be all 0 because, if we do not break the symmetry, the hidden units will all compute the same function. But doesn’t this mean that W should not start out symmetric in any way? From my understanding, it shouldn’t be all 1 or all 2 either?

Exactly. That is why we randomly initialize all the W values.
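
To see why any constant fill fails, not just zeros, here is a short derivation for a single training example. The notation is assumed for illustration and is not taken from the lecture: hidden unit i has input weights w_i, bias b_i and activation function g, and its output weight v_i feeds a single sigmoid output (with bias c) trained with cross-entropy loss L.

```latex
\begin{aligned}
z_i &= w_i^\top x + b_i, \qquad a_i = g(z_i), \qquad
\hat{y} = \sigma\Big(\sum_k v_k a_k + c\Big) \\
\frac{\partial L}{\partial v_i} &= (\hat{y} - y)\, a_i \\
\frac{\partial L}{\partial w_i} &= (\hat{y} - y)\, v_i\, g'(z_i)\, x \\
\frac{\partial L}{\partial b_i} &= (\hat{y} - y)\, v_i\, g'(z_i)
\end{aligned}
```

If two hidden units i and j start with w_i = w_j, b_i = b_j and v_i = v_j, then z_i = z_j, so every gradient above is the same for both units, and a gradient descent step leaves them equal again. Filling W with all 1s or all 2s satisfies that starting condition just as all 0s does, so the units never diverge no matter how long you train.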

Note that I gave a link in one of my earlier replies on this thread to a thread that goes into more detail about symmetry breaking.

I checked the link, and your response was very thorough. Thank you for pointing it out to me.