K-means plot for image classification project

Katherine_Moss · June 25, 2025, 12:14am

I added this K-means plot to the Georgia Project. The cephalexin and phenylglycine images used for the plot were part of a binary classification project – but, after some testing, they fell neatly into four clusters, as seen below.

For more on the OpenCrystalData dataset, see Crystallization impurity detection | Kaggle

For more on the Georgia Project, including this K-means plot, see GitHub - KatherineMossDeveloper/The-Georgia-Project: Study of the cropped images in the OpenCrystalData dataset on Kaggle

Cheers,

Katherine

Faizan_481 · July 8, 2025, 5:43am

That’s a really interesting observation! Even though the original task was binary classification, the emergence of four distinct clusters through K-means suggests there might be hidden patterns or subgroups within the cephalexin and phenylglycine images, possibly related to variations in structure, concentration, or experimental conditions.

Katherine_Moss · July 8, 2025, 10:22am

I decided to run K-means on the images as an academic exercise, but then I saw, when using 4 centroids, that the phenylglycine images divided themselves into 3 groups, while the cephalexin images did not. That teaches me that being ‘academic’ can lead to understanding.
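For anyone who wants to try the same kind of check, here is a minimal sketch, not the code from the Georgia Project. It assumes each image has already been reduced to a feature vector, and it uses synthetic 2-D points as a stand-in for the real crystal images. The helper names `kmeans` and `crosstab` are made up for illustration. It runs plain Lloyd's K-means with 4 centroids and then counts how many samples of each class land in each cluster.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm. Returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    # Initialise the centroids from k distinct data points.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared distance from every point to every centroid.
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Move each centroid to the mean of its points (keep it if empty).
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids

def crosstab(labels, classes, k):
    """Count how many samples of each class fall in each cluster."""
    table = {name: [0] * k for name in sorted(set(classes))}
    for lab, cls in zip(labels, classes):
        table[cls][lab] += 1
    return table

# Synthetic stand-in data (an assumption, NOT the OpenCrystalData images):
# one blob for cephalexin, three blobs for phenylglycine.
rng = np.random.default_rng(42)
ceph = rng.normal(loc=[0, 0], scale=0.3, size=(50, 2))
pg = np.vstack([rng.normal(loc=c, scale=0.3, size=(50, 2))
                for c in ([5, 0], [0, 5], [5, 5])])
X = np.vstack([ceph, pg])
classes = ["cephalexin"] * 50 + ["phenylglycine"] * 150

labels, _ = kmeans(X, k=4)
table = crosstab(labels, classes, k=4)
for name, counts in table.items():
    print(name, counts)
```

With a single random initialisation, K-means can settle on a poor split, so in practice it is worth repeating the run with several seeds and keeping the result with the lowest total within-cluster distance.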

As you point out, I am curious about the differences in things like experimental conditions; however, the only information I have from the dataset is the date and time stamp in the image files. The first PNG image file that I extracted information from said that it was created shortly before midnight, so perhaps the experimental condition was that the operator had a lot of coffee.
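In case it is useful, here is one way such a time stamp could be read, as a sketch only. It assumes the time stamp lives in the PNG's own `tIME` chunk (which, per the PNG format, records the last modification time) or in a `tEXt` entry; the dataset's files may store it differently. The function name `png_time_info` and the demo date are made up for illustration.

```python
import struct
import zlib
from datetime import datetime

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def png_time_info(data: bytes) -> dict:
    """Collect the tIME chunk and any tEXt entries from raw PNG bytes."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    info = {}
    pos = 8
    while pos + 8 <= len(data):
        # Each chunk: 4-byte length, 4-byte type, data, 4-byte CRC.
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if ctype == b"tIME" and length == 7:
            y, mo, d, h, mi, s = struct.unpack(">HBBBBB", body)
            info["tIME"] = datetime(y, mo, d, h, mi, s)
        elif ctype == b"tEXt":
            key, _, text = body.partition(b"\x00")
            info[key.decode("latin-1")] = text.decode("latin-1")
        elif ctype == b"IEND":
            break
        pos += 12 + length
    return info

def _chunk(ctype: bytes, body: bytes) -> bytes:
    """Build one PNG chunk (used only to make the demo file below)."""
    crc = zlib.crc32(ctype + body)
    return struct.pack(">I", len(body)) + ctype + body + struct.pack(">I", crc)

# Tiny 1x1 grayscale PNG built in memory, with a made-up time stamp.
demo = (PNG_SIGNATURE
        + _chunk(b"IHDR", struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0))
        + _chunk(b"tIME", struct.pack(">HBBBBB", 2020, 1, 2, 23, 55, 0))
        + _chunk(b"IDAT", zlib.compress(b"\x00\x00"))
        + _chunk(b"IEND", b""))
info = png_time_info(demo)
print(info)
```

For a file on disk, the same function can be called as `png_time_info(open(path, "rb").read())`, where `path` is whichever image you want to inspect.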

Cheers,

Katherine
