Matrix projection onto a vector


I didn't understand why we have to divide the vector by its norm. What is the mathematical proof of this? I couldn't find one. What happens if we don't divide?

Hi @Joseph_Rock

The division is done for scaling purposes, and the mathematical reason comes from the definition of a projection.

Without dividing by the norm, the result would be stretched or compressed based on the length of v, distorting the true projection.

If you don't divide by the norm of v, you won't get the correct magnitude of the projection.
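Here is a quick NumPy sketch of that point (the vectors below are arbitrary examples I picked for illustration):

```python
import numpy as np

v = np.array([3.0, 4.0])   # direction to project onto; ||v|| = 5
a = np.array([2.0, 1.0])   # vector being projected

# Correct projection: dividing by v @ v (= ||v||^2) makes the result
# depend only on the direction of v, not on its length.
proj = (a @ v) / (v @ v) * v      # [1.2, 1.6]

# Without the division, the result scales with ||v||^2:
wrong = (a @ v) * v               # [30., 40.] -- 25x too long here

# Rescaling v changes `wrong` but leaves `proj` untouched:
v2 = 10 * v
print((a @ v2) / (v2 @ v2) * v2)  # still [1.2, 1.6]
```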

Hope it helps! Feel free to ask if you need further assistance.

1 Like

This is not a question of "proof": it's just a convention. If you want to understand the effect of the projection, it's simpler and clearer to use unit vectors as the input. There are two effects of the projection:

  • The direction of the output vector.
  • The length of the output vector.

If you start with a unit vector, then you know what effect the projection will have on the length of any other input vector just by computing the length of the output vector. If you use a non-unit vector to characterize the projection, then you have to divide by the length of the input vector in order to compute the effect that the transformation has on the length of the output vector. So it's just a question of where you put the normalization computation.
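A minimal sketch of that idea (the matrix A below is just an arbitrary example, not from the course):

```python
import numpy as np

A = np.array([[2.0, 0.0],
              [0.0, 0.5]])            # some linear transformation

u = np.array([1.0, 0.0])              # unit vector input
print(np.linalg.norm(A @ u))          # 2.0 -- the scaling factor, read off directly

w = np.array([3.0, 0.0])              # non-unit input along the same direction
print(np.linalg.norm(A @ w) / np.linalg.norm(w))   # also 2.0, but you had to normalize
```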

Hmm, I can't find a way to see that this is correct intuitively; I can't picture it in front of my eyes. But I think I can use the formula for projecting one vector onto another, which is shown in the image (I understand that proof). To satisfy this formula we have dot products (between the matrix columns and the eigenvector we want to project onto), so the remaining part is the vector v divided by its magnitude squared, which is our projection matrix. As I said before, I couldn't picture why we need to divide by the vector's norm, but I can link it this way. Is that correct?
[Image: vector projection formula, proj_v(a) = (a·v / ||v||²) v]
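A quick numerical check of the link I am describing (numbers made up):

```python
import numpy as np

v = np.array([1.0, 2.0])   # vector we project onto (e.g., an eigenvector)
a = np.array([3.0, 1.0])   # any vector

# Projection matrix built from v: P = v v^T / ||v||^2
P = np.outer(v, v) / (v @ v)

# It agrees with the classic projection formula proj_v(a) = (a.v / v.v) v
print(np.allclose(P @ a, (a @ v) / (v @ v) * v))   # True
```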

I didn't get an answer to this one :confused:

Please realize that the mentors here are just fellow students. We are volunteers, meaning that we do not get paid to do this. That means we do not "work for you" and are not required to give you an immediate answer in all cases. There are also timezone differences to consider here. I am UTC -7 and also had kind of a busy day yesterday in my "real life". :nerd_face:

It would help if you could give us a reference to which lecture contains the original slide you show in your first post above. I would say that the first formula you show is doing something different from the second one. The first one is not really what I would call a "projection". You have a linear transformation, which is expressed by the matrix A. You then apply that linear transformation to the vector v and the question is what does that resulting output vector look like? There are, of course, an infinite number of possible such linear transformations and the goal is for us to have ways to understand and characterize the effect of a particular transformation (mapping). Does applying A increase or decrease the length of the input vector? For that purpose, it is simpler to start with v as a unit vector.

The second expression you show does actually express what I think is the definition of the projection of one vector a onto another vector v. That is a different operation than the first one you showed. The projection takes advantage of this way of expressing the meaning of the dot product of two vectors:

a \cdot v = ||a|| \, ||v|| \, \cos(\theta)

where \theta is the angle between the two vectors. If you substitute that into your formula, you get:

\mathrm{proj}_v a = ||a|| \, \cos(\theta) \, \displaystyle \frac{v}{||v||}

which would be the orthogonal projection of the vector a onto the direction of the vector v, right?
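You can check that identity numerically, e.g. in NumPy (arbitrary vectors):

```python
import numpy as np

a = np.array([2.0, 3.0])
v = np.array([4.0, 1.0])

# Projection via the dot-product form: (a . v / v . v) v
proj_dot = (a @ v) / (v @ v) * v

# Same thing via ||a|| cos(theta) times the unit vector v / ||v||
cos_theta = (a @ v) / (np.linalg.norm(a) * np.linalg.norm(v))
proj_cos = np.linalg.norm(a) * cos_theta * v / np.linalg.norm(v)

print(np.allclose(proj_dot, proj_cos))   # True
```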

So I think the fundamental confusion here is what the purpose of your initial formula is. Please give us more information about the context of your question.

Sorry, I wasn't meaning to be rude :slight_smile: I just wanted to be sure I wasn't being ignored. The first image was taken from the Linear Algebra course, week 4, Dimensionality Reduction and Projection.

I just want to be super clear about how and why the covariance matrix works. Yes, there is already a video named PCA - Why It Works, but I still don't understand, because there wasn't a rigorous and exact proof in it. I just want a source that clearly shows me why and how PCA works. In the lab assignments I can see that it works properly, but I couldn't get the mathematical background of it.

Well, the math behind PCA is non-trivial. How much math background do you have? You will need to have taken at least an undergraduate level course in Linear Algebra (the course that math and physics majors take, not the course that statistics or psych majors take). It is based on Singular Value Decomposition. Here's the wikipedia page on PCA. Google will find you plenty more articles about that.

Here's an earlier thread with some links about PCA.

I am currently a CS student, and I got an A in my Linear Algebra course, but I could only solve problems; I couldn't visualize the underlying math. That's exactly why I'm taking this Coursera course. I want to be done with this course before going on to Calculus for ML and AI, but it looks like I need to take Calculus and then come back to the mathematical foundations of PCA. I thought I could prove it simply with 3-4 formulas, but it looks like it requires serious knowledge of calculus… Am I thinking the right way?

Free advice:

I don't really think a detailed understanding of PCA is necessary for any practical work.

I use hammers all the time - but I could not begin to tell you how to make one.

Also, quite a lot of the content of this course is only tangentially related to Machine Learning. A lot of it is just classical math tricks:

  • You'll never see Newton's Method used in machine learning.
  • You'll never roll any loaded dice.
  • You won't need to use Gaussian elimination.
1 Like

Well, tbh, the reason I'm chasing this is to have better intuition about ML/AI, their applications, and the underlying math. If I understand these concepts better, then I think I can build better models and algorithms, and maybe even invent new tech… Am I wrong? Weren't these words said by Andrew Ng in the first lecture of this course? I would be grateful if you could enlighten me.

It's great that you want to have a detailed understanding. Forge right ahead.

Well, I'm forcing myself to understand PCA before moving on to calculus, but it looks like it would be better if I take calculus and then come back to the foundations of PCA. What do you think?

Calculus and PCA are not related topics. It doesn't matter which you study first.

Yes, it's fine to just proceed with the next calculus class and then later to come back and think more about PCA. Just make sure your expectations are realistic: you will not learn any calculus in the next course that will shed any further light on how and why PCA works.

1 Like

Sir Paul, you haven't mentioned any opinion about my dedication and forcing myself toward something. Am I doing the correct thing? I'm asking this sincerely; your ideas are welcome.

Hi, Joseph.

If your goal is to become a researcher in the ML/DL/AI space and develop new algorithms and improve existing ones, then your approach of wanting to understand the underlying math rather than just accepting it as a given is a good thing overall. It's just that you have to keep a sense of balance here: you could also make math a career, and getting a PhD in math is a lot of work and time commitment. It is important to understand which math topics are worth spending more time on, so that you get a good "return on investment" from the time and effort you spend. Please understand that I'm not saying that I'm an expert in any of this or can give you any "rules of thumb" about how to make those decisions in the general case.

But in the particular case of PCA, my "take" is that being able to write out a proof that PCA works is less important than just getting the basic intuition of how and why it works. The fundamental ideas are that they compute the covariance matrix of all the features and then compute the eigenvectors of that matrix. Those are ordered in decreasing absolute value of the corresponding eigenvalues. What that "eigendecomposition" gives you is a picture of which directions in the feature space carry the most variance. You can then pick a threshold value and eliminate the weaker components below that level. They show you in the assignment some examples of how the choice of threshold affects the resolution of the reconstructed images.
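In code, that pipeline looks roughly like this (a minimal sketch with toy data; variable names are mine, not from the assignment):

```python
import numpy as np

# Toy data: n samples x d features (random, purely illustrative)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))

# 1. Center the features and compute their covariance matrix
Xc = X - X.mean(axis=0)
C = np.cov(Xc, rowvar=False)          # d x d

# 2. Eigendecomposition of the (symmetric) covariance matrix;
#    sort components by decreasing eigenvalue
eigvals, eigvecs = np.linalg.eigh(C)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# 3. Keep the k strongest components and project the data onto them
k = 2
W = eigvecs[:, :k]                    # d x k
X_reduced = Xc @ W                    # n x k

# Lossy reconstruction back in the original feature space
X_approx = X_reduced @ W.T + X.mean(axis=0)
```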

1 Like

@Joseph_Rock I've recently been doing my math classes elsewhere, and I know I picked up this text some time ago and mentioned it to Paul; I finally dug into Gilbert Strang's 'Linear Algebra for Everyone'.

I wouldn't say it is entirely a walk in the park (though it is nowhere near as deep as Strang's 'tome'), but it walks you through everything you need to know about Linear Algebra as it concerns Data Science/ML, including a section on PCA, and it is thorough, rather than cursory.

I even just noticed that at the end there is a short section that goes into Neural Nets and Convolutions, but in math terms.

@TMosh I would not entirely disagree with you that you don't need to know how to 'make the hammer'. Yet with the math, I think it depends on what you are learning ML for: interest/hobby/your own programming?

Or are you looking for a job? These days in an interview there's a pretty good chance they'll ask you a math question or two.

I also agree with both of them, though, that Calculus isn't going to help you at all with understanding SVD/PCA.

1 Like

Well, I've been thinking about what you wrote, and the term "return on investment" makes perfect sense to me. As I said before, I'm a last-year CS student; I have a lot of things to learn and do that have higher priority, I guess. As tempting as it is to chase the underlying math of everything, I have to use my resources (time and energy) efficiently and spend them on the priorities. If I do a master's or PhD, I think I'll have greater opportunities and time to learn this much deeper math. I guess learning the things given in this course properly will be more than enough at this level. So I've decided to just push myself to learn the course's concepts (and just the basic proofs, lol), improve my problem/algorithm-solving skills, and push for the things I am able to learn. If it's a deep and dangerous sea for me (only for now), I'll note it and come back to work on it when I have a stronger theoretical mathematical background. Thanks for your advice, and sorry if I've been a difficult student :slight_smile: I hope to see you in the next courses' questions.

2 Likes

Thanks for the suggestions, I'll definitely consider it. I'm just a student pushing my limits to have comfort for the rest of my life :slight_smile: