Momentum vortex

A question regarding momentum optimization.
When using momentum, isn't there a risk of ending up in a scenario where the gradient points in the right direction (i.e., toward the minimum) but the accumulated momentum pulls in an orthogonal direction, so that we circle the desired point like a vortex for many steps, or even forever?


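Yes, it can spiral around the minimum if Beta is too large. With a large Beta the velocity is dominated by past gradients, so the current gradient only changes the direction of travel slowly and the update overshoots the minimum before it turns back.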
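The overshooting is easy to reproduce in a few lines. Below is a rough sketch, not taken from any particular course or library: it assumes the exponentially weighted form of momentum (v = Beta * v + (1 - Beta) * grad, then w = w - lr * v) and a made-up quadratic bowl f(x, y) = 0.5 * (a*x^2 + b*y^2) with its minimum at the origin. The function names, learning rate, curvatures, and starting point are all illustrative choices.

```python
import math


def run_momentum(beta, lr=0.1, steps=2000, start=(1.0, 1.0), curvature=(1.0, 4.0)):
    """Momentum descent on f(x, y) = 0.5 * (a*x**2 + b*y**2); minimum at (0, 0).

    Assumed update rule: v = beta * v + (1 - beta) * grad, then w = w - lr * v.
    """
    a, b = curvature
    x, y = start
    vx = vy = 0.0
    path = [(x, y)]
    for _ in range(steps):
        gx, gy = a * x, b * y              # gradient of the quadratic bowl
        vx = beta * vx + (1 - beta) * gx   # velocity: running average of gradients
        vy = beta * vy + (1 - beta) * gy
        x -= lr * vx
        y -= lr * vy
        path.append((x, y))
    return path


def count_crossings(path):
    """Number of times the x coordinate changes sign, i.e. overshoots the minimum."""
    return sum((x0 < 0) != (x1 < 0) for (x0, _), (x1, _) in zip(path, path[1:]))


def steps_to_settle(path, tol=0.01):
    """First step after which the iterate stays within `tol` of the minimum."""
    far = [i for i, (x, y) in enumerate(path) if math.hypot(x, y) >= tol]
    return (far[-1] + 1) if far else 0


if __name__ == "__main__":
    for beta in (0.0, 0.9, 0.99):
        path = run_momentum(beta)
        print(f"beta={beta}: x crossings={count_crossings(path)}, "
              f"steps to settle={steps_to_settle(path)}")
```

In this setup the iterate with Beta = 0.99 keeps swinging past the minimum and takes far longer to settle than with Beta = 0.9, while Beta = 0 (plain gradient descent) approaches without overshooting. On a convex bowl like this one the loops do shrink over time as long as Beta < 1 and the learning rate is small enough, so it is "many steps" rather than "forever"; other loss surfaces may behave differently.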

Here is a nice simulator you can play with. You can download the source code and make any adjustments you want. For example, try Momentum with Beta set to 0.99.


