Could you please guide me on this issue?
Training examples with normalized features can speed up gradient descent compared to unnormalized features.
Increasing the number of training examples can improve the prediction accuracy of a learning model.
If I increase the number of training examples while using normalized features, will gradient descent speed up or slow down in finding the minimum of the cost function?
Gradient descent is faster with normalized features because you can use a higher learning rate.
Having more examples always requires more computation per iteration, so training with more examples is slower.
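To make the two points above concrete, here is a minimal sketch using batch gradient descent for linear regression in NumPy. The synthetic data, the `gradient_descent` helper, the learning rates, and the tolerance are all assumptions chosen for illustration, not anything taken from a course or library. It counts how many iterations each setup needs before the gradient norm drops below the tolerance.

```python
import numpy as np

def gradient_descent(X, y, lr, max_iters=5000, tol=1e-6):
    """Batch gradient descent for linear regression.

    Returns (theta, iterations used). If the gradient norm never drops
    below tol, the iteration count equals max_iters.
    """
    m, n = X.shape
    theta = np.zeros(n)
    for i in range(1, max_iters + 1):
        # Each gradient evaluation touches all m examples: O(m * n) work.
        grad = X.T @ (X @ theta - y) / m
        if np.linalg.norm(grad) < tol:
            return theta, i
        theta -= lr * grad
    return theta, max_iters

def add_bias(features):
    """Prepend a column of ones for the intercept term."""
    return np.column_stack([np.ones(len(features)), features])

# Hypothetical data: two features on very different scales.
rng = np.random.default_rng(0)
m = 200
x1 = rng.uniform(0, 1, m)
x2 = rng.uniform(0, 1000, m)
y = 3 + 2 * x1 + 0.05 * x2 + rng.normal(0, 0.1, m)

raw = np.column_stack([x1, x2])
scaled = (raw - raw.mean(axis=0)) / raw.std(axis=0)  # z-score normalization

# Raw features force a tiny learning rate to stay stable;
# normalized features tolerate a much larger one.
_, raw_iters = gradient_descent(add_bias(raw), y, lr=1e-6)
_, scaled_iters = gradient_descent(add_bias(scaled), y, lr=0.5)
print("raw features:       ", raw_iters, "iterations")
print("normalized features:", scaled_iters, "iterations")
```

In this sketch the raw-feature run uses up its whole iteration budget without converging, while the normalized run stops well before that. Increasing `m` makes the `X.T @ (X @ theta - y)` product proportionally more expensive, so each iteration costs more even when the number of iterations stays about the same.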
Sure. However, carelessly choosing a higher learning rate results in a poor model, because gradient descent fails to converge.
Dear Mr Tom Mosher and Mr Moshood Olawale,
Thank you so much for your information.
That happens if the learning rate is too high.
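As a rough illustration of that failure mode, the sketch below runs the same kind of batch gradient descent on a small normalized dataset with two learning rates. The data, the `run_gd` helper, and the two rates are assumptions picked for the example. For this particular setup (one z-scored feature plus an intercept) the stable range is a learning rate below 2, so 0.1 should shrink the cost and 2.5 should make it grow.

```python
import numpy as np

def run_gd(X, y, lr, steps=20):
    """Run batch gradient descent; return the cost measured before each update."""
    m = len(y)
    theta = np.zeros(X.shape[1])
    costs = []
    for _ in range(steps):
        err = X @ theta - y
        costs.append(err @ err / (2 * m))  # squared-error cost J(theta)
        theta -= lr * (X.T @ err) / m
    return costs

# Hypothetical noise-free data with a single normalized feature.
x = np.linspace(-1, 1, 50)
z = (x - x.mean()) / x.std()
X = np.column_stack([np.ones_like(z), z])
y = 1 + 2 * x

low = run_gd(X, y, lr=0.1)   # within the stable range: cost decreases
high = run_gd(X, y, lr=2.5)  # too high: each step overshoots and cost grows
print("lr=0.1: first cost", low[0], "last cost", low[-1])
print("lr=2.5: first cost", high[0], "last cost", high[-1])
```

Plotting or printing the cost per iteration like this is a simple way to check whether a chosen learning rate is too high: a rising cost is the sign to reduce it.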