  1. Training examples with normalized features can speed up gradient descent compared to unnormalized features.

  2. Increasing the number of training examples can improve the accuracy of prediction of a learning model.

If I increase the number of training examples with normalized features, will gradient descent speed up or slow down in finding the minimum of the cost function?

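Gradient descent is faster with normalized features because you can use a higher learning rate. When the features are on very different scales, the learning rate has to be small enough to stay stable along the largest-scale feature, which makes progress along the other features very slow.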
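
As a rough illustration, here is a minimal sketch that counts how many batch gradient descent steps are needed on the same data with and without normalization. Everything in it is an assumption made for the example and not something from the question: the linear regression model with a mean-squared-error cost, the synthetic data, the learning rates, and the helper name `steps_to_converge`.

```python
import numpy as np

def steps_to_converge(X, y, lr, tol=1e-6, max_iter=10_000):
    """Batch gradient descent for linear regression (mean squared error).

    Returns the number of steps until the gradient norm drops below tol,
    or None if the run diverges or does not converge within max_iter.
    (Hypothetical helper for this example only.)
    """
    m, n = X.shape
    w, b = np.zeros(n), 0.0
    with np.errstate(all="ignore"):  # silence overflow warnings when diverging
        for step in range(1, max_iter + 1):
            err = X @ w + b - y
            grad_w = X.T @ err / m
            grad_b = err.mean()
            grad_norm = np.sqrt(grad_w @ grad_w + grad_b ** 2)
            if not np.isfinite(grad_norm):
                return None  # diverged
            if grad_norm < tol:
                return step
            w -= lr * grad_w
            b -= lr * grad_b
    return None  # not converged within max_iter

# Synthetic data (assumed): one feature in [0, 1], one in [0, 1000].
rng = np.random.default_rng(0)
m = 500
X_raw = np.column_stack([rng.uniform(0, 1, m), rng.uniform(0, 1000, m)])
y = 3.0 * X_raw[:, 0] + 0.002 * X_raw[:, 1] + 1.0 + rng.normal(0, 0.1, m)

# Standardize each feature to zero mean and unit variance.
X_norm = (X_raw - X_raw.mean(axis=0)) / X_raw.std(axis=0)

raw_small_lr = steps_to_converge(X_raw, y, lr=1e-6)   # stable but very slow
raw_large_lr = steps_to_converge(X_raw, y, lr=0.5)    # too large for raw features
norm_large_lr = steps_to_converge(X_norm, y, lr=0.5)  # fine after normalization

print("raw features, lr=1e-6:", raw_small_lr)
print("raw features, lr=0.5: ", raw_large_lr)
print("normalized,   lr=0.5: ", norm_large_lr)
```

In this setup the unnormalized runs return `None` (one is too slow to converge within the step limit, the other diverges), while the normalized run converges with the larger learning rate.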

Having more examples always requires more computation: with batch gradient descent, every step has to process every training example, so training on more examples is slower.
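
To make the cost per step concrete, assume linear regression with a mean-squared-error cost over $m$ examples and $n$ features (the question does not name a model, so this is only an illustrative case):

$$
J(\theta) = \frac{1}{2m}\sum_{i=1}^{m}\left(\theta^\top x^{(i)} - y^{(i)}\right)^2,
\qquad
\frac{\partial J}{\partial \theta_j} = \frac{1}{m}\sum_{i=1}^{m}\left(\theta^\top x^{(i)} - y^{(i)}\right)x_j^{(i)}
$$

Each gradient evaluation sums over all $m$ examples, so one step costs on the order of $mn$ operations, and doubling the number of examples roughly doubles the work per step.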

That happens if the learning rate is too high.

