K-Means Clustering

Why do we get different clusters when we initialize the centroids at different points?
I thought there must be a single ideal clustering that the algorithm would converge to. How can these different clusterings be justified?

Hi @yusufnzm,

welcome to the community and thanks for your question!

> Why do we get different clusters when we initialize the centroids at different points?

With different initialisations you can end up in different local optima of the K-means cost function. Besides the initialisation, randomness in the training process can also contribute to this; see also this thread: How different initialization of centroids of K-means results in drastic different clusters ? They all share common cost function - #4 by Christian_Simonis
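To make the local-optimum effect concrete, here is a minimal sketch of Lloyd's algorithm (the standard K-means iteration) run from two different initializations on the same hand-crafted data. The dataset and both initializations are illustrative assumptions, chosen so that one start finds the natural three groups while the other places two centroids in the same group and gets stuck in a worse local optimum:

```python
import numpy as np

def kmeans(X, init, n_iter=50):
    """Plain Lloyd's algorithm: alternate assignment and centroid update."""
    centroids = init.astype(float).copy()
    for _ in range(n_iter):
        # Assign each point to its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its assigned points.
        for k in range(len(centroids)):
            if np.any(labels == k):
                centroids[k] = X[labels == k].mean(axis=0)
    # Final assignment and cost (inertia = sum of squared distances).
    labels = np.linalg.norm(
        X[:, None, :] - centroids[None, :, :], axis=2
    ).argmin(axis=1)
    inertia = ((X - centroids[labels]) ** 2).sum()
    return labels, centroids, inertia

# Three well-separated groups of four points each (illustrative data).
centers = np.array([[0.0, 0.0], [10.0, 0.0], [5.0, 10.0]])
offsets = np.array([[-0.5, 0.0], [0.5, 0.0], [0.0, -0.5], [0.0, 0.5]])
X = (centers[:, None, :] + offsets[None, :, :]).reshape(-1, 2)

# Init A: one centroid per true group -> converges to the global optimum.
_, _, inertia_a = kmeans(X, centers)

# Init B: two centroids in the same group -> stuck in a local optimum
# where one centroid has to cover two groups at once.
bad_init = np.array([[-0.5, 0.0], [0.5, 0.0], [7.5, 5.0]])
_, _, inertia_b = kmeans(X, bad_init)

print(inertia_a, inertia_b)  # the bad init ends with a much higher cost
```

Both runs converge, i.e. the centroid updates stop changing anything, yet they converge to different clusterings with very different costs. This is exactly why K-means implementations typically restart from several random initializations and keep the best result.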

One criterion by which a clustering can be justified or assessed is a (Shannon) entropy-based measure such as Adjusted Mutual Information; further clustering criteria are described here: Performance Metrics in ML - Part 3: Clustering | Towards Data Science
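As a quick sketch of how Adjusted Mutual Information behaves (assuming scikit-learn is installed; the label arrays below are made up for illustration): it compares two labelings while ignoring the arbitrary numbering of the clusters, and it is corrected for chance agreement.

```python
# Assumes scikit-learn is available.
from sklearn.metrics import adjusted_mutual_info_score

reference = [0, 0, 0, 1, 1, 1]

# The same partition with permuted label names: AMI is invariant
# to renaming the clusters, so the score is 1.0.
permuted = [1, 1, 1, 0, 0, 0]
print(adjusted_mutual_info_score(reference, permuted))

# A labeling unrelated to the reference scores near 0 (chance level).
unrelated = [0, 1, 0, 1, 0, 1]
print(adjusted_mutual_info_score(reference, unrelated))
```

This chance correction is what distinguishes AMI from plain mutual information, which would give a random labeling a spuriously positive score.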

If you are uncertain what a good number of clusters is, you can take a closer look at the elbow method, which is often used in practice.
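The elbow method can be sketched as follows (assuming scikit-learn is installed; the synthetic three-group data is an assumption for illustration): fit K-means for a range of k, record the cost (inertia), and look for the point where the cost stops dropping sharply.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Synthetic data with 3 true groups (illustrative assumption).
X = np.vstack([
    rng.normal(c, 0.5, size=(50, 2))
    for c in [(0, 0), (10, 0), (5, 10)]
])

inertias = []
for k in range(1, 7):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias.append(km.inertia_)

# Inertia always decreases as k grows; the "elbow" is where the
# decrease slows sharply. Here the drop from k=2 to k=3 is large,
# while the drop from k=3 to k=4 is small, suggesting k=3.
for k, v in zip(range(1, 7), inertias):
    print(k, round(v, 1))
```

Note that inertia alone cannot pick k for you, since it always improves with more clusters; the elbow heuristic looks at the shape of the curve, not its absolute values.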

Hope that helps!

Best regards
Christian


Thank you, much appreciated.