C4W1 final assignment explanation after exercise 4


Hello, can someone explain more about why at most 55 eigenvalues will be non-zero? I’m guessing it has to do with the fact that there are 55 examples, but mathematically I’m confused. In simple terms: does the covariance matrix still have up to 4096 eigenvalues/eigenvectors, with the remaining 4041 being zero?

I think it means that the majority of the variance of the dataset is expressed in the first 55 eigenvectors. So the remainder can be ignored without a significant loss of information.
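One way to check the "at most 55 non-zero eigenvalues" claim numerically is sketched below. It is only an illustration: random numbers stand in for the 55 cat images (an assumption, not the assignment's data), and the variable names are made up for this example. The idea it relies on is that a covariance matrix built from 55 examples has rank at most 55 (and at most 54 once the mean is subtracted), so the 4096 x 4096 matrix still has 4096 eigenvalues, but all the others are zero up to rounding.

```python
import numpy as np

# Stand-in data: 55 "images" of 4096 pixels each (random, NOT the real cats).
rng = np.random.default_rng(0)
n_samples, n_pixels = 55, 4096
X = rng.normal(size=(n_samples, n_pixels))

# Center each pixel column, as PCA does before building the covariance matrix.
Xc = X - X.mean(axis=0)

# The rank of the centered data bounds the number of non-zero eigenvalues
# of the 4096 x 4096 covariance matrix Xc.T @ Xc / (n_samples - 1).
rank = np.linalg.matrix_rank(Xc)

# The non-zero eigenvalues of that big matrix equal those of the small
# 55 x 55 matrix below, which is much cheaper to decompose.
gram = Xc @ Xc.T / (n_samples - 1)
eigvals = np.linalg.eigvalsh(gram)
n_nonzero = int(np.sum(eigvals > 1e-8 * eigvals.max()))

print("rank of centered data:", rank)
print("non-zero eigenvalues:", n_nonzero)
print("eigenvalues that must be zero:", n_pixels - n_nonzero)
```

The tolerance `1e-8 * eigvals.max()` is an arbitrary choice for deciding what counts as "zero" in floating point.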


Thanks @TMosh; a further follow-up is below.

1.a. In the image of cats, the first 16 principal components are shown (I believe the first cat image has the largest eigenvalue, with the eigenvalues decreasing across the remaining 15 images). Does that mean the first image captures more variance along its component axis than the ones after it?

1.b. If so, does that imply more information is encoded in the earlier cat images? Should we expect a decrease in quality? I don’t notice any downward trend in quality/information with the naked eye.

2.a. My understanding is that we’ve taken a cat image (which could be represented by 4096 pixels) and reduced its dimensionality. How many dimensions is each cat now represented by? This would be given by the dimensions of the respective eigenvector, correct?

2.b. We did this by finding the PCA axes (in this case the 55 with the greatest eigenvalues/variance) and representing each cat image as the dot product of eigenvalue and corresponding eigenvector. Since the cat images are in decreasing order of eigenvalue, should we expect less variance, and thus a less clear picture of the cat? I don’t see a major difference in quality among images 1 to 16.

Sorry, I don’t understand your message.
Most of the text is in “strikeout” font.

Sorry, I’ve reframed the question above. Thanks.

2A. Now there are 55 features.
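To make the "55 features" count concrete, here is a minimal sketch of the projection step. It assumes random stand-in data rather than the assignment's cat images, and it uses NumPy's SVD instead of the eigendecomposition from the assignment (both give the same principal directions). Note the two different sizes: each principal direction still has 4096 entries, while each reduced image has one coordinate per kept direction.

```python
import numpy as np

# Stand-in data: 55 flattened 64 x 64 "images" (random, NOT the real cats).
rng = np.random.default_rng(0)
X = rng.normal(size=(55, 4096))
Xc = X - X.mean(axis=0)  # center each pixel column

# Rows of Vt are the principal directions, sorted by decreasing variance.
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

k = 55                         # number of components kept, as in the thread
components = Vt[:k]            # each direction lives in pixel space
X_reduced = Xc @ components.T  # each image becomes k coordinates

print("one principal direction has", components.shape[1], "entries")
print("one reduced image has", X_reduced.shape[1], "features")
```

So the reduced representation of an image is its vector of dot products with the kept directions, not the directions themselves.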

2B. If you only remove the non-essential features, you haven’t lost any useful data. So the images should be nearly identical.

If you discard too many features, then you will lose data, and you’ll be able to see the degradation in the images.
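The trade-off described here can be sketched by reconstructing the images from fewer and fewer components and measuring the error. This is an illustration under assumptions: the data is random rather than the cat images, and `reconstruction_error` is a helper invented for this example. Real images usually concentrate far more variance in the first few components than random data does, so their degradation would set in more slowly.

```python
import numpy as np

# Stand-in data: 55 flattened "images" of 4096 pixels (random, NOT real cats).
rng = np.random.default_rng(0)
X = rng.normal(size=(55, 4096))
mean = X.mean(axis=0)
Xc = X - mean

# Principal directions as rows of Vt, ordered by decreasing variance.
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)

def reconstruction_error(k):
    """Relative error after keeping only the first k components (hypothetical helper)."""
    V = Vt[:k]
    X_hat = (Xc @ V.T) @ V + mean  # project down to k features, then map back
    return float(np.linalg.norm(X - X_hat) / np.linalg.norm(Xc))

errors = {k: reconstruction_error(k) for k in (1, 16, 40, 55)}
for k, err in errors.items():
    print(f"k={k:2d}  relative error={err:.3e}")
```

Keeping all 55 components reproduces the data up to rounding, and the error grows as more components are dropped.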
