Why doesn't Newton's Method overshoot?

I’d like to confirm why Newton’s Method does not overshoot reaching a zero of a function. Adding the formula below:


I believe Newton’s Method does not overshoot because f(x) gradually approaches 0, the subtracted term thus becomes a 0, and thus xn+1 starts to become equivalent to xn. How correct is that analysis and are there other ways that we should consider why Newton’s Method does not overshoot?

Where does it say that Newton’s Method never overshoots? Mind you, I have not taken this course, so perhaps I am “out of bounds” here, since I don’t know what is actually said in the lectures. But speaking as someone who actually was a math major “back in the day”, it simply can’t be the case that it always converges for an arbitrary function and arbitrary choice of x_0. Functions can be pretty complicated and still be differentiable. If your initial guess for x_0 is unlucky, it could take you off into the weeds.


Ah, I might be wrong then. I don’t believe the course material addressed the points of overshooting or non-convergence, so I might have incorrectly assumed that those are not issues or maybe I missed some material.

Can anyone help to clarify this? Specifically:

  • Is there material that I missed?
  • How are overshooting or non-convergence for Newton’s Method handled in practice?

Hi @farmerinatechstack

in addition to @paulinpaloalto‘s excellent reply:


We need to distinguish between two things:

  • overshooting: I think this is nothing bad in general. (It can be bad in control theory of course, but during an optimization it is not in general). So it’s fine if f(2) = 0, newton method goes iteratively from x_1 = 3x_2 =1.8 x_3 =2. No problem overshooting 2 here. Or see also the graphical example above.
  • convergence: this is rather the issue, quoting the previous source:

Newton’s method is only guaranteed to converge if certain conditions are satisfied.

e.g. let’s take the assumption of discontinuous derivative. If not fulfilled, a permanent overshooting can lead to significant convergence issues which really is a problem to solve the optimization problem. You can find these assumptions within this source as well as such a concrete example for overshooting + convergence problems.

Hope that helps!

Best regards

One guess I have is that we could limit the number of iterations and check if the change in the approximation is becoming lower than some threshold.

Some personal feedback: of course it depends on the underlying function, but very often I have had quite good experience with regula falsi - also known as Pegasus method. In my experience it is more robust than the newton method in many cases!


Big Plus is also that Pegasus method has some guarantees for conversion!

I think trying out several initial points or even optimising them with a gradient-free Baysian approach can help, see also this thread: Why there are so many optimizer algorithms? - #2 by Christian_Simonis

More concretely, you could also treat the initial value in your newtons method as a hyperparameter which you could optimize in a Baysian way. Truth is however: often this would be too complicated and a simple grid search does the trick!

In the end I believe there is no silver bullet as method for numerical optimization, but it definetely helps to have a nice toolkit for numerical optimization.

PS: Especially for large linear systems also the conjugate gradient method (CGM) is also powerful even though I have to admit since I left university, I did not work a lot with it.

Best regards

1 Like