Here is a thread that discusses zero initialization versus "symmetry breaking" in more detail. It turns out that zero initialization works for Logistic Regression, but any symmetric initialization (zero or any other constant) fails in the Neural Network case: every hidden unit in a layer then computes the same activation and receives the same gradient, so the units remain identical copies of each other throughout training.
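A minimal sketch of the failure mode (the tiny one-hidden-layer architecture and constant value 0.5 are illustrative assumptions, not taken from the thread): after a full forward/backward pass from a constant initialization, every hidden unit's weight column is still identical, so no amount of training breaks the symmetry.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))                 # 8 samples, 3 features
y = rng.integers(0, 2, size=(8, 1)).astype(float)

# Symmetric (constant) initialization: every hidden unit starts identical.
W1 = np.full((3, 4), 0.5)                   # input -> 4 hidden units
W2 = np.full((4, 1), 0.5)                   # hidden -> output

# Forward pass
H = sigmoid(X @ W1)                         # hidden activations (all columns equal)
p = sigmoid(H @ W2)                         # predicted probability

# Backward pass for binary cross-entropy loss
dz2 = p - y                                 # gradient at the output pre-activation
dW2 = H.T @ dz2
dH = dz2 @ W2.T                             # identical gradient flows to every hidden unit
dz1 = dH * H * (1 - H)                      # sigmoid derivative
dW1 = X.T @ dz1

# One gradient step
lr = 0.1
W1 -= lr * dW1
W2 -= lr * dW2

# Every column of W1 (each hidden unit's incoming weights) is still identical:
print(np.allclose(W1, W1[:, :1]))           # True -- symmetry persists
```

Note that the gradients themselves are nonzero; the problem is not that learning stops, but that all hidden units learn the exact same function, collapsing the network to the capacity of a single unit. Logistic regression has no hidden layer, so there is no symmetry to break and a zero start is fine.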