Week 4: Explanation of eigenvectors and the covariance matrix

If the covariance matrix tells us how spread out the data is in each dimension, what does it even mean to use it as a linear transformation?
Also, in the final week’s programming assignment, when we plotted the eigenvectors as images for the cat dataset, the vectors themselves looked like cats. Why was that the case? If eigenvectors tell us the directions in which the variance of our dataset is greatest, then how do the eigenvectors relate to the features of the dataset itself?

Hi @koulritesh98,

Using a covariance matrix as a linear transformation means viewing it as a geometric operation that stretches, scales, and rotates the data’s shape, rather than just describing its spread.

It transforms data points by multiplying them by the matrix, effectively mapping them into a new space where the spread (variances) and correlations (covariances) define the spatial distortion. In particular, this reveals the directions of maximum variance (the eigenvectors), which is what dimensionality-reduction methods like PCA exploit.
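To make that concrete, here is a minimal sketch (the data is synthetic and purely illustrative, assuming NumPy): multiplying a vector by the covariance matrix scales it along the eigenvector directions, so a vector already aligned with an eigenvector is only scaled, never rotated.

```python
import numpy as np

# Synthetic 2-D data with correlated features (purely illustrative)
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2)) @ np.array([[2.0, 0.0], [1.2, 0.5]])

# Covariance matrix of the data
Sigma = np.cov(X, rowvar=False)

# eigh is the right choice here because Sigma is symmetric;
# it returns eigenvalues in ascending order
eigvals, eigvecs = np.linalg.eigh(Sigma)

v = eigvecs[:, -1]            # eigenvector with the largest eigenvalue
print(Sigma @ v)              # applying Sigma only scales v ...
print(eigvals[-1] * v)        # ... by its eigenvalue (same output)

u = np.array([1.0, 0.0])      # a generic direction
print(Sigma @ u)              # generic vectors get rotated toward v as well as scaled
```

This is exactly the “spatial distortion” mentioned above: directions of large variance get amplified the most.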


In PCA (principal component analysis), used for dimensionality reduction and feature extraction, the first eigenvector points in the direction where the data (pixel values) spread the most (maximum variance). This direction usually captures the most significant structure, like the general shape of a cat’s head, ears, or body, because these features are the most prominent and vary consistently.

So the eigenvectors aren’t feature points; they are patterns of variation in pixel intensity that happen to resemble a cat’s shape, because they follow the directions along which the data (pixel values) spread the most (maximum variance).
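As a rough sketch of why plotting an eigenvector looks like an image (the array `images` and its shape are hypothetical stand-ins, not the assignment’s actual variables):

```python
import numpy as np

# Hypothetical stack of grayscale cat images, shape (n_images, h, w)
n_images, h, w = 100, 32, 32
images = np.random.rand(n_images, h, w)   # stand-in for the real dataset

# Flatten each image into a vector: every pixel becomes one dimension
X = images.reshape(n_images, h * w)
X_centered = X - X.mean(axis=0)           # subtract the mean image

# Covariance over pixels, then its eigendecomposition
Sigma = np.cov(X_centered, rowvar=False)  # shape (h*w, h*w)
eigvals, eigvecs = np.linalg.eigh(Sigma)  # ascending eigenvalue order

# Each eigenvector has one entry per pixel, so it can be reshaped
# back into an image; on real cat data it looks cat-like
top_eigenimage = eigvecs[:, -1].reshape(h, w)
```

Because each entry of the eigenvector corresponds to one pixel, the plot shows which pixels vary together most strongly, and on cat images those pixels trace out cat-shaped regions.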

Hello.
Thank you for your response. I have another question. Shouldn’t the general shape of a cat be captured by the mean vector instead?

@koulritesh98

A mean vector represents the average position or center of a set of data points, whereas eigenvectors describe the principal directions (variance/spread) of that data.

The key difference is transformation versus central tendency: eigenvectors are about how a space transforms, while the mean vector is about the typical location of points in that space. The mean is usually calculated before finding the eigenvectors (as in PCA, where the data is centered first).

Eigenvectors reveal the inherent axes of stretching and shrinking, the “shape” of the transformation (for a rotation, the axis), whereas the mean vector summarizes the “center” of the data.

The general form of a cat’s face is captured by a combination of both the mean vector and the eigenvectors (specifically in methods like PCA).

**Mean Vector (The Average Shape)** - The mean vector represents the average pixel intensity of all cat images. If you visualize the mean vector as an image, it looks like a blurry, ghostly “average cat”. It captures the most fundamental shared shape of all cats (for example, ears at the top, body in the middle).

**Eigenvectors (The Principal Components of Variation)** - Eigenvectors, which are derived from the covariance matrix, represent the directions of maximum variance in the data, as I mentioned in my earlier response.

  • The first eigenvector captures the biggest difference between cats (for example, a cat looking left versus right, or a furry cat versus a solid-colored cat).

  • Subsequent eigenvectors capture finer details, like ear shape, eye size, or specific markings.

  • Eigenvalues (importance): each eigenvector’s corresponding eigenvalue measures how much variance (or information) that eigenvector captures. A large eigenvalue means the eigenvector is very important for describing the data’s structure.

Why both the mean vector and the eigenvectors are needed:
The general form isn’t just one picture; it’s the average structure (mean) plus the main ways that structure varies (eigenvectors).
When combined, they allow for dimensionality reduction, for example the “Eigenfaces” method used for facial recognition, where a new face is described as a combination of the mean face and a few top eigenfaces (see the sketch below).
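A minimal sketch of that mean-plus-eigenvectors reconstruction (all names and shapes here are hypothetical, for illustration only):

```python
import numpy as np

# Hypothetical flattened image data: one image per row
n, d = 200, 1024
X = np.random.rand(n, d)                  # stand-in for real images

mean = X.mean(axis=0)                     # the "average cat/face"
Xc = X - mean                             # center the data

# Eigendecomposition of the pixel covariance matrix
eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))

# Keep the k eigenvectors with the largest eigenvalues
k = 10
top = eigvecs[:, -k:]                     # shape (d, k)

# A new image is described as mean + weighted sum of top eigenvectors
x_new = X[0]
coeffs = top.T @ (x_new - mean)           # project onto the top directions
x_approx = mean + top @ coeffs            # low-dimensional reconstruction
```

The vector `coeffs` is exactly “how this image differs from the average”, expressed along the main directions of variation.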

So if you ask me, the mean vector determines what a cat looks like on average, and the eigenvectors determine how a given cat differs from that average look.

Regards
DP

@Deepti_Prasad I see. So let us say I have a dataset of 50 cats with 2 features: they are either black or white, and either very furry or normal. If 30 of them are black and 20 are white, with 20 black and 10 white cats being very furry and the remaining 20 normal, would the average vector represent a slightly furry, dark-grayish cat, whereas the first eigenvector would represent a black and furry cat, and the second eigenvector would represent, let us say, a white furry cat? In conclusion, where the average vector represents the general shape of all cats, would an eigenvector represent the most commonly repeated feature of them all?

Don’t define or assume vectors based on cat features :rofl::joy:

Here, the vectors represent data points, which may or may not be cats. The mean vector gives a blurred “average” image of a cat, whereas the eigenvectors give more pronounced features of a cat (each cat image being a data point).

Usually the first eigenvector is the one with the highest eigenvalue (though not necessarily, depending on how they are ordered), so it gives the most pronounced pattern, based on the largest spread of variance, and each subsequent eigenvector captures further features in order of decreasing eigenvalue.
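A small practical note on that “not necessarily”: with NumPy, for instance, `np.linalg.eigh` returns eigenvalues in ascending order, so the eigenvector with the highest eigenvalue only comes “first” if you sort for it yourself (a minimal sketch):

```python
import numpy as np

Sigma = np.array([[3.0, 1.0],
                  [1.0, 2.0]])            # example covariance matrix

eigvals, eigvecs = np.linalg.eigh(Sigma)  # eigh: ascending eigenvalues

order = np.argsort(eigvals)[::-1]         # sort descending by eigenvalue
eigvals = eigvals[order]
eigvecs = eigvecs[:, order]               # column 0 is now the first PC
```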

Notice that I said the general form of the data is a combination of the mean vector and the eigenvectors, not two separate descriptions: the mean captures an average look of how a cat appears, while the eigenvectors, from highest to lowest eigenvalue, capture the data’s variance (whether a cat is looking away, has a distinct color, is furry, and so on).

Haha, sorry, I just wanted to extend the cat dataset as an example :sweat_smile:. Yes, what I wanted to say is how the average vector and the eigenvectors differ, based on my understanding: where the average vector just tells us what our data looks like in general, an eigenvector, stretching in the direction of high variance, tells us that along this direction you will find the strongest pattern of variation in our dataset.


Yes, perfect :upside_down_face:, although it is not the most repeated pattern in the data but the direction of highest spread (dispersion) of the data, which is what matters for dimensionality reduction.


Thanks again :sweat_smile:
