Does anybody use the conjugate gradient method for training neural networks?

AFAIK, the conjugate gradient method can be much more efficient than plain gradient descent for optimizing functions with a large number of parameters. Is it used to train neural networks? I’d expect it to speed up training.

Yes, some optimizers use it in place of fixed-rate gradient descent, and not only for neural networks.
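
To give a rough feel for the difference, here is a minimal pure-Python sketch comparing fixed-rate gradient descent with linear conjugate gradient on a small quadratic. The matrix `A`, the vector `b`, the step size, and the function names are all made up for illustration. A neural network loss is not quadratic, so real training would need a nonlinear variant (Fletcher-Reeves or Polak-Ribière, for example); this only shows the idea on the simplest case.

```python
# Minimize f(x) = 0.5 * x^T A x - b^T x for a symmetric positive-definite A.
# The gradient is A x - b, so the minimizer solves A x = b.

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def gradient_descent(A, b, x, rate, tol=1e-8, max_iter=100000):
    """Fixed-rate gradient descent; returns (x, iterations used)."""
    for k in range(max_iter):
        g = [ax - bi for ax, bi in zip(matvec(A, x), b)]  # gradient A x - b
        if dot(g, g) ** 0.5 < tol:
            return x, k
        x = [xi - rate * gi for xi, gi in zip(x, g)]
    return x, max_iter

def conjugate_gradient(A, b, x, tol=1e-8, max_iter=1000):
    """Linear conjugate gradient with exact line search; returns (x, iterations used)."""
    r = [bi - ax for ax, bi in zip(matvec(A, x), b)]  # residual = -gradient
    d = list(r)                                        # first direction: steepest descent
    rr = dot(r, r)
    for k in range(max_iter):
        if rr ** 0.5 < tol:
            return x, k
        Ad = matvec(A, d)
        alpha = rr / dot(d, Ad)                        # exact minimizer along d
        x = [xi + alpha * di for xi, di in zip(x, d)]
        r = [ri - alpha * adi for ri, adi in zip(r, Ad)]
        rr_new = dot(r, r)
        beta = rr_new / rr                             # keeps new direction A-conjugate
        d = [ri + beta * di for ri, di in zip(r, d)]
        rr = rr_new
    return x, max_iter

# Ill-conditioned diagonal example (condition number 1000), chosen for illustration.
A = [[1.0, 0.0, 0.0], [0.0, 30.0, 0.0], [0.0, 0.0, 1000.0]]
b = [1.0, 1.0, 1.0]
x0 = [0.0, 0.0, 0.0]

x_gd, n_gd = gradient_descent(A, b, x0, rate=1.0 / 1000.0)
x_cg, n_cg = conjugate_gradient(A, b, x0)
print("gradient descent iterations:", n_gd)
print("conjugate gradient iterations:", n_cg)
```

In exact arithmetic, conjugate gradient solves an n-dimensional quadratic in at most n steps, while fixed-rate gradient descent is limited by the smallest step the steepest direction allows, so it crawls along the flat directions. How much of that advantage carries over to a noisy, non-quadratic network loss is a separate question.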