Hi there.

For a long time, I’ve been wondering how we decide the initial values of w (weight) and b (bias) before applying the gradient descent algorithm. I am currently in week 2 of course 1 of the ML Specialization, and I have noticed that w and b are always given some initial value before gradient descent is run.

Are these values chosen randomly, or is there some strategy behind the choice? If I am wrong anywhere, kindly correct me.
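
To make the question concrete, here is a minimal sketch of the setup I mean. It assumes single-feature linear regression with a squared-error cost; the function name, the toy data, the learning rate, and the two starting points are all made up for illustration and are not taken from the course labs. It runs gradient descent from two different initial values of w and b so the results can be compared.

```python
def gradient_descent(x, y, w_init, b_init, alpha=0.05, num_iters=5000):
    """Batch gradient descent for the model f(x) = w * x + b.

    w_init and b_init are the starting values the question is about.
    alpha and num_iters are illustrative choices, not recommended defaults.
    """
    w, b = w_init, b_init
    m = len(x)
    for _ in range(num_iters):
        # Gradients of the cost J(w, b) = (1 / (2m)) * sum((w*x + b - y)^2)
        dj_dw = sum((w * xi + b - yi) * xi for xi, yi in zip(x, y)) / m
        dj_db = sum((w * xi + b - yi) for xi, yi in zip(x, y)) / m
        # Simultaneous update of both parameters
        w, b = w - alpha * dj_dw, b - alpha * dj_db
    return w, b


# Toy data generated from y = 2x + 1 (hypothetical, for illustration only)
x_train = [1.0, 2.0, 3.0, 4.0]
y_train = [3.0, 5.0, 7.0, 9.0]

# Same algorithm, two different starting points
w_zero, b_zero = gradient_descent(x_train, y_train, w_init=0.0, b_init=0.0)
w_other, b_other = gradient_descent(x_train, y_train, w_init=5.0, b_init=-3.0)

print("start (0, 0):  w =", round(w_zero, 4), " b =", round(b_zero, 4))
print("start (5, -3): w =", round(w_other, 4), " b =", round(b_other, 4))
```

On this toy data both runs should end up close to w = 2 and b = 1, since the squared-error cost for linear regression is convex. I am not sure whether the starting point matters more for other models, which is part of what I am asking.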