Why not always use Adam optimizer?

In addition to point 2) above: if Adam does not converge well, AMSGrad might be worth a look; see also:
https://johnchenresearch.github.io/demon/
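
For reference, PyTorch exposes AMSGrad directly through the `amsgrad` flag of `torch.optim.Adam`. A minimal sketch (the model and learning rate here are just placeholders):

```python
import torch

# Placeholder model purely for illustration.
model = torch.nn.Linear(10, 1)

# AMSGrad keeps the running *maximum* of the second-moment estimate
# instead of the exponential average, which stops the effective step
# size from growing again late in training.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, amsgrad=True)
```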

That page also explains some other algorithms, such as QHM (Quasi-Hyperbolic Momentum), which decouples the momentum term from the current gradient when updating the weights; this can be beneficial as well. A sketch of the update rule follows.
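
To make the decoupling concrete, here is a minimal sketch of the QHM update rule, not a production optimizer. The hyperparameters and the toy objective are illustrative choices (nu=0.7 and beta=0.999 are the defaults suggested in the QHM paper); nu interpolates between plain SGD (nu=0) and momentum SGD (nu=1):

```python
import torch

def qhm_step(params, grads, bufs, lr=0.1, beta=0.999, nu=0.7):
    for p, g, buf in zip(params, grads, bufs):
        # Momentum buffer: exponential moving average of past gradients.
        buf.mul_(beta).add_(g, alpha=1 - beta)
        # Decoupled update: a weighted average of the *current* gradient
        # and the momentum buffer, controlled by nu.
        p.add_(nu * buf + (1 - nu) * g, alpha=-lr)

# Toy usage: minimize ||w||^2 for a single parameter tensor.
w = torch.randn(5)
buf = torch.zeros_like(w)
for _ in range(100):
    grad = 2 * w  # gradient of ||w||^2
    qhm_step([w], [grad], [buf])
print(w)  # should be close to zero
```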

Best regards
Christian