Source of toy_dataset.csv?

Just wondering, can you comment on what toy_dataset.csv is, or how it was generated? It is used in the last practice lab, C3_W2_Lab01_PCA_Visualization_Examples (section "Using PCA in Exploratory Data Analysis").

Thanks!

Hi @Inntr8,
It looks like your question got overlooked somehow. I don’t know anything about the source of the toy_dataset.csv, but I’ll see if I can find someone who does know.

I suspect the numbers are randomly generated, but it is interesting that they cluster so well…

Hello @Inntr8 and @Wendy

The dataset was generated using the make_classification function from the scikit-learn library. The points do indeed cluster well, and that is a consequence of how they were generated.
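For anyone who wants to experiment, here is a minimal sketch of how a similar dataset could be produced and then projected with PCA. The parameter values (sample count, feature count, number of classes, class_sep, random_state) are assumptions chosen for illustration; the values actually used for toy_dataset.csv are not stated in this thread. As far as the scikit-learn documentation describes it, make_classification draws normally distributed clusters around the vertices of a hypercube whose side length scales with class_sep, which would explain the clean clustering.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA

# Hypothetical settings: none of these values are confirmed to match
# the ones used to build toy_dataset.csv.
X, y = make_classification(
    n_samples=500,           # number of rows
    n_features=10,           # total number of columns
    n_informative=3,         # columns that actually carry class structure
    n_redundant=2,           # linear combinations of the informative columns
    n_classes=3,
    n_clusters_per_class=1,
    class_sep=2.0,           # larger values push the clusters further apart
    random_state=0,          # fixed seed so the result is reproducible
)

# Project onto the first two principal components, as one would before plotting.
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)

print(X_2d.shape)
print(pca.explained_variance_ratio_)
```

Lowering class_sep or raising n_clusters_per_class should make the groups overlap more, which is a quick way to see how much of the clustering comes from the generator settings.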

Thanks,
Lucas

Nice! Thanks, @lucas.coutinho!

Thank you very much @lucas.coutinho and @Wendy !!