A GLM (generalized linear model) is typically fitted with a maximum likelihood algorithm in the implementations provided by many packages.

Does gradient descent have advantages over maximum likelihood?

One potential shortcoming of gradient descent is that it may converge to a local minimum rather than the global minimum.
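
To make the comparison concrete, here is a minimal sketch, assuming a logistic regression (a GLM with the canonical logit link) fitted to simulated data. It minimizes the same negative log-likelihood in two ways: with Newton/IRLS-style updates, which is my understanding of how maximum likelihood fitting is commonly implemented, and with plain gradient descent. The function names (`fit_irls`, `fit_gd`), the simulated coefficients, the learning rate, and the iteration counts are all illustrative assumptions and are not taken from any package.

```python
import numpy as np

def sigmoid(eta):
    return 1.0 / (1.0 + np.exp(-eta))

def neg_log_lik(beta, X, y):
    # Bernoulli negative log-likelihood with logit link.
    eta = X @ beta
    return np.sum(np.logaddexp(0.0, eta) - y * eta)

def fit_irls(X, y, n_iter=25):
    # Newton / IRLS-style updates (hypothetical helper, not a package API).
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = sigmoid(X @ beta)
        w = mu * (1.0 - mu)                 # working weights
        score = X.T @ (y - mu)              # gradient of the log-likelihood
        info = X.T @ (X * w[:, None])       # Fisher information matrix
        beta = beta + np.linalg.solve(info, score)
    return beta

def fit_gd(X, y, lr=0.5, n_iter=5000):
    # Plain gradient descent on the mean negative log-likelihood.
    beta = np.zeros(X.shape[1])
    n = len(y)
    for _ in range(n_iter):
        mu = sigmoid(X @ beta)
        beta = beta + lr * (X.T @ (y - mu)) / n
    return beta

# Simulated data; the "true" coefficients are arbitrary assumptions.
rng = np.random.default_rng(0)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
true_beta = np.array([-0.5, 1.0, -2.0])
y = rng.binomial(1, sigmoid(X @ true_beta)).astype(float)

beta_irls = fit_irls(X, y)
beta_gd = fit_gd(X, y)

print("IRLS coefficients:", beta_irls)
print("GD coefficients:  ", beta_gd)
print("NLL (IRLS):", neg_log_lik(beta_irls, X, y))
print("NLL (GD):  ", neg_log_lik(beta_gd, X, y))
```

For this particular model the negative log-likelihood is convex, so I would expect both runs to land on the same coefficients, with gradient descent needing far more iterations. I am not sure whether that carries over to every family and link function.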

I am interested in your thoughts on which algorithm is better for fitting a GLM.