Is k-means clustering somehow related to k-nearest neighbours?

I always confuse the two. Please help me clear up this confusion.

No, they’re quite different (k-means is unsupervised, KNN is supervised).

Here’s an article I found on the internet:

K-Means groups examples by their similarity. You don’t know if the groups are “correct”, since there are no labels to provide a truth reference.

KNN labels an example by using the labels of the ‘K’ nearest examples.

Quoting from the link you shared, under K-Means:

Then we calculate the distance of each data point from each centroid we created

Isn’t that how KNN works too: the data points at the nearest distance are treated as part of the same class or group?

In maths we take the distance between two points as a measure of similarity. So do KNN and k-means work the same way, with the only difference being whether a true label column is provided?

For example, if I take the iris dataset and provide the labels, it is KNN. But if I remove the labels and use clustering, will I get similar groups (data points grouped the same way as the classes)? Maybe I should try it, but what do you think?

No, KNN also looks at the labels for the nearest examples.

Okay, so both algorithms rely on a distance calculation. In KNN the goal is to determine which class a new data point belongs to by finding the closest labelled data points and taking the majority class among them.
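To make the majority-vote idea concrete, here is a minimal sketch in plain Python. The function name `knn_predict`, the toy points, and the labels "a" and "b" are all made up for illustration, and Euclidean distance is assumed as the metric.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    # Rank training indices by Euclidean distance to the query point.
    order = sorted(range(len(train_points)),
                   key=lambda i: math.dist(train_points[i], query))
    # Look at the labels of the k closest points and pick the most common one.
    nearest_labels = [train_labels[i] for i in order[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Toy data (hypothetical): two labelled groups in 2-D.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(points, labels, (1.1, 1.0)))  # query close to the first group
print(knn_predict(points, labels, (5.1, 5.0)))  # query close to the second group
```

Note that nothing is "trained" here: the labelled points are simply stored, and the labels are read at prediction time.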

In k-means, the distance from each point to each centroid is calculated, and the centroids of the clusters are adjusted accordingly.
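A rough sketch of that assign-then-adjust loop (commonly called Lloyd's algorithm) in plain Python. The function name `kmeans`, the toy points, and the hand-picked starting centroids are assumptions for illustration; real implementations usually choose the starting centroids randomly.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points and moving centroids."""
    centroids = [tuple(c) for c in centroids]
    assignments = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        assignments = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        new_centroids = []
        for j in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                new_centroids.append(centroids[j])  # empty cluster: leave it in place
        if new_centroids == centroids:
            break  # nothing moved, so the clustering has converged
        centroids = new_centroids
    return centroids, assignments

# Toy data (hypothetical): the same kind of 2-D points, but with no labels at all.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
centroids, assignments = kmeans(points, centroids=[(0.0, 0.0), (6.0, 6.0)])
print(assignments)  # cluster index per point; the indices are arbitrary IDs, not class names
print(centroids)
```

No labels appear anywhere in this loop, and the cluster indices it returns carry no meaning beyond "same group".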

So the underlying principle is Euclidean distance, but the way it is used is what distinguishes the two algorithms.