How different initialization of centroids of K-means results in drastic different clusters ? They all share common cost function

In addition, solvers like ADAM [Adaptive Moment Estimation] utilize „Momentum“ of the optimization trajectory to overcome local minima in order to get to the global optimum eventually.


Best
Christian

2 Likes