C1W4_Assignment

2.7 Explained variance

Can anyone help me understand how the 95% red line corresponds to 35 principal components in the image below?

I know it is a stupid question, but I can’t see how the graph explains the statement below it.

Thank you for your time.

Hi @Q_V_2

To find out how many principal components are needed to explain 95% of the variance, find where the cumulative explained variance curve intersects the 0.95 red line.

In this case, the intersection occurs at around 55 principal components. However, make sure the values you are plotting are correct! It’s possible that the expected intersection is at 35 principal components.
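If you want to check the intersection numerically instead of reading it off the plot, here is a minimal sketch. The function name `components_for_threshold` and the toy ratios are made up for illustration and are not the assignment's values. I'm assuming you have a list of per-component explained variance ratios sorted from largest to smallest (if you happened to be using scikit-learn, a fitted `PCA` object exposes these as `explained_variance_ratio_`).

```python
from itertools import accumulate


def components_for_threshold(ratios, threshold=0.95):
    """Return the smallest k whose first k ratios sum to at least `threshold`."""
    for k, cumulative in enumerate(accumulate(ratios), start=1):
        if cumulative >= threshold:
            return k
    raise ValueError("threshold is never reached; do the ratios sum to about 1?")


# Hypothetical ratios, largest first. These are NOT the assignment's values.
toy_ratios = [0.5, 0.25, 0.125, 0.0625, 0.03125, 0.03125]

print(list(accumulate(toy_ratios)))          # [0.5, 0.75, 0.875, 0.9375, 0.96875, 1.0]
print(components_for_threshold(toy_ratios))  # 5
```

The cumulative curve in the plot is just that running sum, so the number printed here should match where the curve first reaches the red line. If it doesn't, the plotted values and the statement are probably based on different data.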

Hope it helps! Feel free to ask if you need further assistance.

I see. However, the plot was generated automatically by the editor and I didn’t change any of it, which is why I was confused about how it intersects at 55 instead of 35 (as the statement says).

But thank you for the information. It is helpful.

I think this is a tricky concept to understand, but the way I like to think about it is to imagine putting on a headset:

And maybe it is not the ‘perfect’ headset, so there is some crackle or feedback.

But your goal is to work out what ‘information’ you can obtain from this ‘signal’. Okay, in rare cases we are ‘sure’, so we just hear a dull tone, but most situations in the real world are not like this.

So we have to deal with this difficult balance between what is noise and what is information.

If you have *no variance*, there probably isn’t any information.

But to go back to the headset analogy, our minds are pretty good at picking out a simple message or song, even with static (though we can be ‘tricked’: JPEG and MP3 are both good examples, since we don’t notice *everything* they throw away).

So what I would say is that PCA is like MP3 compression, just this time applied to your data.
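To make the compression analogy concrete, here is a rough sketch using NumPy on synthetic data (nothing here comes from the assignment; `pca_compress` and the data are invented for illustration). It keeps only the top `k` principal components, rebuilds the data from them, and compares what was lost against the explained variance.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 200 samples, 10 features, driven by 2 hidden factors
# plus a little noise (the "static" in the headset).
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 10))
X = latent @ mixing + 0.05 * rng.normal(size=(200, 10))


def pca_compress(X, k):
    """Project X onto its top-k principal components and reconstruct it."""
    mean = X.mean(axis=0)
    Xc = X - mean                                   # PCA works on centered data
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:k]                             # top-k directions, shape (k, n_features)
    codes = Xc @ components.T                       # compressed representation
    X_hat = codes @ components + mean               # lossy reconstruction
    explained = (S[:k] ** 2).sum() / (S ** 2).sum() # fraction of variance kept
    return codes, X_hat, explained


codes, X_hat, explained = pca_compress(X, k=2)

# Fraction of the total (centered) variation lost in the reconstruction.
rel_error = np.linalg.norm(X - X_hat) ** 2 / np.linalg.norm(X - X.mean(axis=0)) ** 2

print(codes.shape)             # 2 numbers per sample instead of 10
print(explained, rel_error)    # the two should add up to 1
```

The point is that the explained variance and the reconstruction error are two sides of the same number: keeping 95% of the variance means the rebuilt data differs from the original by the remaining 5%, in the same way an MP3 discards the part of the audio you are least likely to miss.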