Sample K group information

Dear all, I am using the K-means code (C3_W1) on my own dataset. The dataset has 93 samples, each with 47 features. After running the code, I get a plot of the groups. How do I export the information about which sample belongs to which K group?

Thanks a lot.

What does your plot represent? The axes aren’t labeled.

Hello @Zhenguang_Zhang,

Were you making a plot like this one in the assignment?

I suggest you check out the code that generates the above plot to understand:

  1. Why are there three colors (red, green, and blue) in the plot?
  2. What is the meaning of the black lines?
  3. What does each axis of the plot represent?
  4. Once you know the answers to the three questions above for the assignment’s plot, consider how you ported the code from the assignment’s data to your own data, and answer the three questions again for your own plot.

Cheers,
Raymond

Thanks a lot, Raymond. Yes, I am using the same code that generated the graph in the assignment.

  1. I set K = 3, so there are 3 different colors.
  2. The black line is the trajectory of the closest K point.
  3. The axes represent the distance between data points.

My aim is to explore possible clustering in my data. The data has 93 samples, each with 47 features. Some of the samples have human-assigned labels, but I would like to inspect the data objectively to see whether there are any patterns. The task is similar to running t-SNE on scRNA-seq/flow cytometry data. Do you have any code for this task?

Hello @Zhenguang_Zhang,

It is great that you have some understanding of the code you used. Since you set K = 3, you get three groups. Do examine the code more closely and you should be able to tell which sample belongs to which group, as this is also a key concept that you should learn in this assignment. Remember that there is a group assignment step in the K-means algorithm? If you are not sure, it may be time to revisit the lectures.

I encourage learners to solve their own problems, even though it can be a tough process, and I do not have code for your problem. I recommend searching online for ideas on visualization and thinking about which idea is useful for which aspect of your data. You have 47 features, and it is hard to imagine visualizing all 47 of them in one plot. Therefore, you might inspect them two or three at a time, focusing on what you have to deliver to your audience, and use visualization to help you with that.

Cheers,
Raymond
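
For readers following along, the group assignment step mentioned above can be sketched roughly as follows. This is only a minimal NumPy illustration, not the assignment's code: the function name `assign_to_closest_centroid` and the toy data are assumptions made up for the example.

```python
import numpy as np

def assign_to_closest_centroid(X, centroids):
    """Return, for each row of X, the index of its nearest centroid.

    Hypothetical helper for illustration; X has shape (m, n) and
    centroids has shape (K, n).
    """
    # diffs[i, j, :] is the vector from centroid j to sample i
    diffs = X[:, np.newaxis, :] - centroids[np.newaxis, :, :]
    # distances[i, j] is the squared Euclidean distance from sample i to centroid j
    distances = np.sum(diffs ** 2, axis=2)
    # The group of sample i is the centroid with the smallest distance
    return np.argmin(distances, axis=1)

# Toy data (made up): 4 samples with 2 features, and K = 2 centroids
X = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.9]])
centroids = np.array([[0.0, 0.0], [5.0, 5.0]])

idx = assign_to_closest_centroid(X, centroids)
print(idx)  # one group index per sample, in the same order as the rows of X
```

The point is that the result has one entry per sample, so entry `i` is the group of sample `i`.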

Thank you, Raymond. I found that the K value for each sample is in the idx slot.
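
Once the per-sample group indices are available, exporting them could look something like the sketch below. It assumes `idx` is a one-dimensional array with one group index per sample, in the same order as the rows of the data. The helper `export_assignments`, the placeholder sample names, the stand-in `idx` values, and the output file name are all hypothetical.

```python
import csv
import numpy as np

def export_assignments(idx, path, sample_names=None):
    """Write one row per sample: its name and its K-means group index.

    Hypothetical helper; idx is assumed to hold one group index per sample.
    """
    idx = np.asarray(idx).astype(int)
    if sample_names is None:
        # Fall back to placeholder names when real sample IDs are not supplied
        sample_names = [f"sample_{i}" for i in range(len(idx))]
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["sample", "cluster"])
        for name, k in zip(sample_names, idx):
            writer.writerow([name, int(k)])

# Stand-in for the idx produced by K-means (made-up values, K = 3)
idx = np.array([0, 2, 1, 1, 0])
out_path = "cluster_assignments.csv"  # assumed output location
export_assignments(idx, out_path)

# Number of samples that fell into each group
counts = np.bincount(idx, minlength=3)
print(counts)
```

Passing the real sample IDs as `sample_names` makes it easy to compare the clusters against the human labels afterwards.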