Matrix projection on to vector

Joseph_Rock · September 30, 2024, 7:02pm

I didn’t understand why we have to divide the vector by its norm. What is the mathematical proof of this? I couldn’t find one. What happens if we don’t divide?

Alireza_Saei · September 30, 2024, 7:15pm

Hi @Joseph_Rock

The division is done for scaling purposes, and the mathematical reason comes from the definition of a projection.

Without dividing by the norm, the result would be stretched or compressed based on the length of v, distorting the true projection.

If you don’t divide by the norm of v, you won’t get the correct magnitude of the projection.
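The stretching effect described above can be checked numerically. The following is only a minimal sketch in plain Python: the helper names `dot`, `project` and `project_unnormalized` and the sample vectors are made up for illustration, and it assumes the standard vector projection formula (a·v / ||v||²) v rather than anything specific to the course slide.

```python
def dot(a, b):
    """Dot product of two equal-length vectors given as lists."""
    return sum(x * y for x, y in zip(a, b))

def project(a, v):
    """Orthogonal projection of a onto the direction of v: (a.v / ||v||^2) v."""
    scale = dot(a, v) / dot(v, v)  # dot(v, v) is ||v||^2
    return [scale * x for x in v]

def project_unnormalized(a, v):
    """The same expression with the division left out: (a.v) v."""
    scale = dot(a, v)
    return [scale * x for x in v]

a = [3.0, 4.0]
v_short = [2.0, 0.0]   # points along the x-axis, length 2
v_long = [10.0, 0.0]   # same direction, length 10

# With the division, the result depends only on the direction of v.
p1 = project(a, v_short)
p2 = project(a, v_long)
print(p1, p2)

# Without it, the result grows with the squared length of v.
q1 = project_unnormalized(a, v_short)
q2 = project_unnormalized(a, v_long)
print(q1, q2)
```

Both normalized results are the same point on the x-axis, while the unnormalized ones differ by a factor of 25, the ratio of the squared lengths of the two vectors.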

Hope it helps! Feel free to ask if you need further assistance.

paulinpaloalto · September 30, 2024, 8:01pm

This is not a question of “proof”: it’s just a convention. If you want to understand the effect of the projection, it’s simpler and clearer to use unit vectors as the input. There are two effects of the projection:

The direction of the output vector.
The length of the output vector.

If you start with a unit vector, then you know what effect the projection will have on the length of any other input vector just by computing the length of the output vector. If you use a non-unit vector to characterize the projection, then you have to divide by the length of the input vector in order to compute the effect that the transformation has on the length of the output vector. So it’s just a question of where you put the normalization computation.

Joseph_Rock · September 30, 2024, 9:47pm

Hmm, I can’t see intuitively why this is correct; I can’t picture it. But I think I can use the formula for the projection of one vector onto another, which is shown in the image (I understand that proof). To satisfy this formula we have the dot products (between the matrix’s columns and the eigenvector we want to project onto), so the remaining part is the vector v divided by its squared magnitude, which is our projection matrix. As I said before, I couldn’t picture why we need to divide by the vector’s norm, but I can link it this way. Is that correct?

Joseph_Rock · October 1, 2024, 3:32pm

I didn’t get an answer to this one.

paulinpaloalto · October 1, 2024, 3:57pm

Please realize that the mentors here are just fellow students. We are volunteers, meaning that we do not get paid to do this. That means we do not “work for you” and are not required to give you an immediate answer in all cases. There are also timezone differences to consider here. I am UTC -7 and also had kind of a busy day yesterday in my “real life”.

It would help if you could give us a reference to which lecture contains the original slide you show in your first post above. I would say that the formula you show is doing something different than the second formula you show. The first one is not really what I would call a “projection”. You have a linear transformation, which is expressed by the matrix A. You then apply that linear transformation to the vector v and the question is what does that resulting output vector look like? There are, of course, an infinite number of possible such linear transformations and the goal is for us to have ways to understand and characterize the effect of a particular transformation (mapping). Does applying A increase or decrease the length of the input vector? For that purpose, it is simpler to start with v as a unit vector.

The second expression you show does actually express what I think is the definition of the projection of one vector a onto another vector v. That is a different operation than the first one you showed. The projection takes advantage of this way of expressing the meaning of the dot product of two vectors:

$$a \cdot v = \|a\| \, \|v\| \cos(\theta)$$

where $\theta$ is the angle between the two vectors. If you substitute that into your formula, you get:

$$\mathrm{proj}_v a = \|a\| \cos(\theta) \, \frac{v}{\|v\|}$$

which would be the orthogonal projection of the vector a onto the direction of the vector v, right?
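Written out step by step, the substitution goes roughly as follows. This assumes the formula in the image is the usual vector projection, with the dot product divided by the squared norm of v; the image itself is not reproduced in this thread, so treat that starting line as an assumption.

```latex
\begin{aligned}
\mathrm{proj}_v a
  &= \frac{a \cdot v}{\|v\|^2} \, v \\
  &= \frac{\|a\| \, \|v\| \cos(\theta)}{\|v\|^2} \, v \\
  &= \|a\| \cos(\theta) \, \frac{v}{\|v\|}
\end{aligned}
```

One factor of the norm cancels against the dot product, and the other turns v into a unit vector, so the length of the result is set by a and the angle alone.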

So I think the fundamental confusion here is what the purpose of your initial formula is. Please give us more information about the context of your question.

Joseph_Rock · October 1, 2024, 5:47pm

Sorry, I didn’t mean to be rude; I just wanted to be sure I wasn’t being ignored. The first image was taken from the Linear Algebra course, week 4, Dimensionality Reduction and Projection.

I just want to be completely clear about how and why the covariance matrix works the way it does. Yes, there is already a video named PCA - Why It Works, but I still don’t understand it, because it didn’t give a rigorous and exact proof. I just want a source that clearly shows me why and how PCA works. In the lab assignments I can see that it works properly, but I couldn’t get the mathematical background of it.

paulinpaloalto · October 1, 2024, 6:25pm

Well, the math behind PCA is non-trivial. How much math background do you have? You will need to have taken at least an undergraduate level course in Linear Algebra (the course that math and physics majors take, not the course that statistics or psych majors take). It is based on Singular Value Decomposition. Here’s the Wikipedia page on PCA. Google will find you plenty more articles about that.

Here’s an earlier thread with some links about PCA.

Joseph_Rock · October 1, 2024, 6:42pm

I am currently a CS student and I got an A in my Linear Algebra course, but I could only solve problems; I couldn’t visualize the underlying math, which is exactly why I’m taking this Coursera course. I wanted to finish this course before going on to Calculus for ML and AI, but it looks like I need to take Calculus and then come back to the mathematical foundations of PCA. I thought I could prove it simply with 3-4 formulas, but it looks like it requires serious knowledge of calculus… Am I thinking about this the right way?

TMosh · October 1, 2024, 6:55pm

Free advice:

I don’t really think a detailed understanding of PCA is necessary for any practical work.

I use hammers all the time - but I could not begin to tell you how to make one.

Also, quite a lot of the content of this course is only tangentially-related to Machine Learning. A lot of it is just classical math tricks:

You’ll never see Newton’s Method used in machine learning.
You’ll never roll any loaded dice.
You won’t need to use Gaussian elimination.

Joseph_Rock · October 1, 2024, 7:07pm

Well, to be honest, the reason I’m chasing this is to have better intuition about ML/AI, their applications, and the underlying math. If I understand these concepts better, then I think I can build better models and algorithms, and maybe even invent new tech. Am I wrong? Isn’t that what Andrew Ng said in the first lecture of this course? I would be grateful if you could enlighten me.

TMosh · October 1, 2024, 7:09pm

It’s great that you want to have a detailed understanding. Forge right ahead.

Joseph_Rock · October 1, 2024, 7:11pm

Well, I’m forcing myself to understand PCA before moving on to calculus, but it looks like it would be better if I took calculus first and then came back to the foundations of PCA. What do you think?

TMosh · October 1, 2024, 7:18pm

Calculus and PCA are not related topics. It doesn’t matter which you study first.

paulinpaloalto · October 1, 2024, 7:18pm

Yes, it’s fine to just proceed with the next calculus class and then later to come back and think more about PCA. Just make sure your expectations are realistic: you will not learn any calculus in the next course that will shed any further light on how and why PCA works.

Joseph_Rock · October 1, 2024, 7:42pm

Sir Paul, you haven’t said anything about my dedication and the way I’m pushing myself. Am I doing the right thing? I’m asking this sincerely; your thoughts are welcome.

paulinpaloalto · October 1, 2024, 8:14pm

Hi, Joseph.

If your goal is to become a researcher in the ML/DL/AI space and develop new algorithms and improve existing ones, then your approach of wanting to understand the underlying math rather than just accepting it as a given is a good thing overall. It’s just that you have to keep a sense of balance here: you could also make math a career and getting a PhD in math is a lot of work and time commitment. Understanding which math topics are worth spending more time on, so that you get good “return on investment” from the time and effort you spend is important. Please understand that I’m not saying that I’m an expert in any of this or can give you any “rules of thumb” about how to make those decisions in the general case.

But in the particular case of PCA, my “take” is that being able to write out a proof that PCA works is less important than just getting the basic intuition of how and why it works. The fundamental ideas are that they compute the covariance matrix of all the features and then compute the eigenvectors of that matrix. Those will be ordered by decreasing value of the corresponding eigenvalues. What that “eigendecomposition” gives you is a picture of which directions (combinations of features) account for the most variance in the data. You can then pick a threshold value and eliminate the weaker components below a certain level. They show you in the assignment some examples of how the choice of threshold affects the resolution of the reconstructed images.
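The steps just outlined (covariance matrix, eigenvectors sorted by eigenvalue, keep the strongest ones) can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the code from the course lab: the function name `pca_sketch` and the synthetic two-feature data are assumptions made up for this example.

```python
import numpy as np

def pca_sketch(X, k):
    """Project the rows of X (samples x features) onto the top-k principal directions."""
    X_centered = X - X.mean(axis=0)             # center each feature
    cov = np.cov(X_centered, rowvar=False)      # features x features covariance
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigh: symmetric input, ascending eigenvalues
    order = np.argsort(eigvals)[::-1]           # re-sort in decreasing order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    components = eigvecs[:, :k]                 # unit-length eigenvectors as columns
    return X_centered @ components, eigvals, components

# Synthetic data (an assumption for illustration): the second feature is
# roughly twice the first, so most of the variance lies along one direction.
rng = np.random.default_rng(0)
t = rng.normal(size=200)
X = np.column_stack([t, 2.0 * t + 0.1 * rng.normal(size=200)])

X_reduced, eigvals, components = pca_sketch(X, k=1)
explained = eigvals / eigvals.sum()             # share of variance per direction
print(X_reduced.shape, explained)
```

Because `np.linalg.eigh` returns unit-length eigenvectors, the projection `X_centered @ components` needs no extra division by a norm, which ties back to the original question in this thread.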

Nevermnd · October 1, 2024, 8:14pm

@Joseph_Rock I’ve recently been doing my math classes elsewhere. I picked up this text some time ago and mentioned it to Paul, and I have finally dug into it: Gilbert Strang’s ‘Linear Algebra for Everyone’.

I wouldn’t say it is entirely a walk in the park (but it is nowhere near as deep as Strang’s ‘tome’), but it walks you through everything you need to know about Linear Algebra as it concerns Data Science/ML, including a section on PCA, and it is thorough rather than cursory.

I even just noticed that at the end there is a short section that goes into Neural Nets and Convolutions, but in math terms.

@TMosh I would not entirely disagree with you that you don’t need to know how to ‘make the hammer’. Yet with the math, I think it depends on what you are learning ML for: interest, a hobby, your own programming?

Or are you looking for a job? These days in an interview there’s a pretty good chance they’ll ask you a math question or two.

I also agree with both of them, though, that Calculus isn’t going to help you at all with understanding SVD/PCA.

Joseph_Rock · October 2, 2024, 9:17pm

Well, I’ve been thinking about what you wrote, and the term “return on investment” makes perfect sense to me. As I said before, I’m a final-year CS student, and I have a lot of things to learn and do that have higher priority, I guess. As tempting as it is to chase the underlying math, I have to use my resources (time and energy) efficiently and spend them on the priorities. If I do a master’s or PhD, I think I’ll have more opportunities and time to learn math this deep. I guess properly learning what is taught in this course will be more than enough “at this level”. So I’ve decided to just push myself to learn the course’s concepts (and just the basic proofs, lol), improve my problem-solving and algorithm skills, and work hard on the things I’m able to learn. If something is a deep and dangerous sea for me (only for now), I’ll note it and come back to work on it when I have a stronger theoretical math background. Thanks for your advice, and sorry if I’ve been a difficult student. I hope to see you in the questions for the next courses.

Joseph_Rock · October 2, 2024, 9:19pm

Thanks for the suggestions, I’ll definitely consider it. I’m just a student pushing my limits so that I can be comfortable for the rest of my life.
