Clustering and PCA

Can clustering be done on PCA data, or vice versa?

Hi @ajeancharles,

PCA is generally used for dimensionality reduction; see also this thread: How to evaluate-visualize clusters derived through PCA - #2 by Christian_Simonis

PCA is often very helpful for clustering because it simplifies the data (fewer features) while keeping most of the relevant information (a chosen percentage of the variance in the data). You can therefore use PCA to prepare the data before modelling, e.g. to reduce redundancy through dimensionality reduction. After transforming the data into the lower-dimensional PC space (i.e. after the PCA), you usually apply a clustering method or model suited to your problem; see also: 2.3. Clustering — scikit-learn 1.3.0 documentation
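To make the "PCA first, then clustering" idea concrete, here is a minimal sketch with scikit-learn. The Iris dataset, the scaling step, the 95% variance threshold, the choice of k-means and the number of clusters are all my own assumptions for illustration; pick whatever fits your data and problem.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Example data only (assumption): 150 samples with 4 features.
X, _ = load_iris(return_X_y=True)

pipeline = make_pipeline(
    StandardScaler(),        # PCA is scale-sensitive, so standardize first
    PCA(n_components=0.95),  # keep enough components to explain >= 95% of the variance
    KMeans(n_clusters=3, n_init=10, random_state=0),  # cluster in the reduced PC space
)

# Fit all steps and return one cluster label per sample.
labels = pipeline.fit_predict(X)

# Inspect how many components were kept and how much variance they explain.
pca = pipeline.named_steps["pca"]
print("components kept:", pca.n_components_)
print("explained variance:", pca.explained_variance_ratio_.sum())
```

Wrapping the steps in one pipeline means new data goes through the same scaling and projection before it gets a cluster label, e.g. via pipeline.predict(X_new).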

In summary: PCA can often help with a clustering problem as a preparation step before the clustering (model) step; see also this repo. Hope that helps!

Best regards


Great! I hope to take advantage of this possibility.

Thank you!
