Gradient Descent question

2 Questions:

1) How do you use the initial values for gradient descent?
2) Why does the gradient descent equation always lower “w” by subtracting the derivative term?

Thanks in advance.

The initial values are for the weights. Gradient descent starts from there, and proceeds “downhill” to the minimum cost.

Gradient descent is intended to be used with convex cost functions. Those have the characteristic that the gradient is negative when a weight is below the optimum and positive when it is above. So subtracting the (scaled) gradient doesn’t always lower “w” — it moves “w” toward the minimum from either direction.
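A minimal sketch of that behavior, using a made-up convex cost J(w) = (w − 3)² whose minimum sits at w = 3 (the function, learning rate, and starting point are all illustrative, not from the course):

```python
def grad(w):
    """dJ/dw for J(w) = (w - 3)**2: negative when w < 3, positive when w > 3."""
    return 2 * (w - 3)

alpha = 0.1  # learning rate

w = 0.0  # start below the minimum: gradient is negative, so w - alpha*grad(w) INCREASES w
for _ in range(100):
    w = w - alpha * grad(w)

print(round(w, 4))  # converges toward 3.0
```

Note the update is the same `w - alpha * grad(w)` whether we start above or below 3 — the sign of the gradient is what flips, not the formula.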

Thanks TMosh. How do you select those initial values for w and b?

Generally just set them all to zeros. It’s as good a set of initial values as any, since we have no clue what the final values will be until training is complete.
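For instance, here is a sketch of linear regression trained from all-zero initial values (the data set and learning rate are invented for illustration; the targets follow y = 2x + 1, so training should recover roughly w = 2, b = 1):

```python
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]  # generated from y = 2x + 1

w, b = 0.0, 0.0  # all-zero initial values, as suggested above
alpha = 0.05     # learning rate
m = len(xs)

for _ in range(5000):
    # Gradients of the mean squared error cost with respect to w and b
    dw = sum((w * x + b - y) * x for x, y in zip(xs, ys)) / m
    db = sum((w * x + b - y) for x, y in zip(xs, ys)) / m
    w -= alpha * dw
    b -= alpha * db

print(round(w, 2), round(b, 2))  # approaches 2.0 and 1.0
```

Starting from any other finite values would reach the same minimum for a convex cost like this one — zeros are just a convenient default.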
