I didn't understand why we have to divide the vector by its norm. What is the mathematical proof of this? I couldn't find one. What happens if we don't divide?

Hi @Joseph_Rock

The division is done for scaling purposes, and the mathematical reason comes from the definition of a projection.

Without dividing by the norm, the result would be stretched or compressed based on the length of v, distorting the true projection.

If you don't divide by the norm of v, you won't get the correct magnitude of the projection.

Hope it helps! Feel free to ask if you need further assistance.

This is not a question of "proof": it's just a convention. If you want to understand the effect of the projection, it's simpler and clearer to use unit vectors as the input. There are two effects of the projection:

The direction of the output vector.

The length of the output vector.

If you start with a unit vector, then you know what effect the projection will have on the length of any other input vector just by computing the length of the output vector. If you use a non-unit vector to characterize the projection, then you have to divide by the length of the input vector in order to compute the effect that the transformation has on the length of the output vector. So it's just a question of where you put the normalization computation.
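To make the "where you put the normalization" point concrete, here is a minimal sketch in plain Python. The vectors and the helper names (`dot`, `norm`, `scalar_projection`) are made up for illustration and are not taken from the course material.

```python
import math

def dot(a, b):
    """Dot product of two equal-length vectors given as lists."""
    return sum(x * y for x, y in zip(a, b))

def norm(v):
    """Euclidean length of v."""
    return math.sqrt(dot(v, v))

def scalar_projection(a, v):
    """Signed length of a along the direction of v: (a . v) / ||v||."""
    return dot(a, v) / norm(v)

a = [3.0, 4.0]
v = [1.0, 0.0]        # unit vector along the x-axis
v_long = [5.0, 0.0]   # same direction, five times longer

raw = dot(a, v_long)                       # no division: grows with ||v_long||
normalized = scalar_projection(a, v_long)  # division removes the effect of ||v_long||
same_as_unit = scalar_projection(a, v)     # a unit vector needs no correction

print(raw, normalized, same_as_unit)
```

Without the division, the result depends on how long `v_long` happens to be. With it, both vectors give the same answer, because only the direction of v matters.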

Hmm, I can't see intuitively why this is correct; I can't picture it in front of my eyes. But I think I can use the formula for the projection of one vector onto another, which is shown in the image (I understand this proof). To match this formula we have the dot products (between the matrix's columns and the eigenvector that we want to project onto), so the remaining part is the vector v divided by its magnitude squared, which is our projection matrix. As I said before, I couldn't picture why we need to divide by the vector norm, but I can link it in this way. Is that correct?

I didn't get an answer to this one.

Please realize that the mentors here are just fellow students. We are volunteers, meaning that we do not get paid to do this. That means we do not "work for you" and are not required to give you an immediate answer in all cases. There are also timezone differences to consider here. I am UTC -7 and also had kind of a busy day yesterday in my "real life".

It would help if you could give us a reference to which lecture contains the original slide you show in your first post above. I would say that the formula you show is doing something different than the second formula you show. The first one is not really what I would call a "projection". You have a linear transformation, which is expressed by the matrix A. You then apply that linear transformation to the vector v and the question is what does that resulting output vector look like? There are, of course, an infinite number of possible such linear transformations and the goal is for us to have ways to understand and characterize the effect of a particular transformation (mapping). Does applying A increase or decrease the length of the input vector? For that purpose, it is simpler to start with v as a unit vector.

The second expression you show does actually express what I think is the definition of the projection of one vector a onto another vector v. That is a different operation than the first one you showed. The projection takes advantage of this way of expressing the meaning of the dot product of two vectors:

a \cdot v = \|a\| \, \|v\| \cos(\theta)

where \theta is the angle between the two vectors. If you substitute that into your formula, you get:

\text{proj}_v a = \|a\| \cos(\theta) \, \frac{v}{\|v\|}

which would be the orthogonal projection of the vector a onto the direction of the vector v, right?
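Spelling out that substitution step by step, assuming the formula in the image is the standard vector projection, i.e. the dot product of a and v divided by the squared norm of v, times v (the image itself is not reproduced in this thread, so that starting point is an assumption):

```latex
\begin{aligned}
\text{proj}_v a &= \frac{a \cdot v}{\|v\|^2}\, v \\
                &= \frac{\|a\| \, \|v\| \cos(\theta)}{\|v\|^2}\, v \\
                &= \|a\| \cos(\theta) \, \frac{v}{\|v\|}
\end{aligned}
```

One factor of \|v\| cancels against the dot product, and the other turns v into the unit vector v / \|v\|. That is where the division by the norm comes from: the length of the result is \|a\| \cos(\theta), which depends only on a and the angle, not on how long v is.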

So I think the fundamental confusion here is what the purpose of your initial formula is. Please give us more information about the context of your question.

Sorry, I didn't mean to be rude; I just wanted to be sure I wasn't being ignored. The first image was taken from the Linear Algebra course, week 4, Dimensionality Reduction and Projection.

I just want to be completely clear about how and why the covariance matrix works the way it does. Yes, there is already a video named PCA - Why It Works, but I still didn't understand it, because it didn't give a rigorous and exact proof. I just want a source that clearly shows me why and how PCA works. In the lab assignments I can see that it works properly, but I couldn't get the mathematical background of it.

Well, the math behind PCA is non-trivial. How much math background do you have? You will need to have taken at least an undergraduate level course in Linear Algebra (the course that math and physics majors take, not the course that statistics or psych majors take). It is based on Singular Value Decomposition. Here's the Wikipedia page on PCA. Google will find you plenty more articles about that.

I am currently a CS student and I got an A in my Linear Algebra course, but I could only solve problems; I couldn't visualize the underlying math, which is exactly why I'm taking this Coursera course. I want to finish this course before going on to Calculus for ML and AI, but it looks like I need to take Calculus and then come back to the mathematical foundations of PCA. I thought I could prove it simply with 3-4 formulas, but it looks like it requires serious knowledge of calculus... Am I thinking about this the right way?

Free advice:

I don't really think a detailed understanding of PCA is necessary for any practical work.

I use hammers all the time - but I could not begin to tell you how to make one.

Also, quite a lot of the content of this course is only tangentially-related to Machine Learning. A lot of it is just classical math tricks:

- You'll never see Newton's Method used in machine learning.
- You'll never roll any loaded dice.
- You won't need to use Gaussian elimination.

Well, to be honest, the reason I'm chasing this is to have better intuition about ML/AI, their applications, and the underlying math. If I understand these concepts better, then I think I can build better models and algorithms, and maybe even invent new tech... Am I wrong? Weren't these words said by Andrew Ng in the first lecture of this course? I would be grateful if you could enlighten me.

It's great that you want to have a detailed understanding. Forge right ahead.

Well, I'm forcing myself to understand PCA before moving on to calculus, but it looks like it would be better if I took calculus and then came back to the foundations of PCA. What do you think?

Calculus and PCA are not related topics. It doesn't matter which you study first.

Yes, it's fine to just proceed with the next calculus class and then later to come back and think more about PCA. Just make sure your expectations are realistic: you will not learn any calculus in the next course that will shed any further light on how and why PCA works.

Sir Paul, you haven't said anything about my dedication and the way I'm forcing myself into this. Am I doing the right thing? I'm asking this sincerely; your ideas are welcome.

Hi, Joseph.

If your goal is to become a researcher in the ML/DL/AI space and develop new algorithms and improve existing ones, then your approach of wanting to understand the underlying math rather than just accepting it as a given is a good thing overall. It's just that you have to keep a sense of balance here: you could also make math a career and getting a PhD in math is a lot of work and time commitment. Understanding which math topics are worth spending more time on, so that you get good "return on investment" from the time and effort you spend is important. Please understand that I'm not saying that I'm an expert in any of this or can give you any "rules of thumb" about how to make those decisions in the general case.

But in the particular case of PCA, my "take" is that being able to write out a proof that PCA works is less important than just getting the basic intuition of how and why it works. The fundamental ideas are that they compute the covariance matrix of all the features and then compute the eigenvectors of that matrix. Those will be ordered by decreasing value of the corresponding eigenvalues. What that "eigendecomposition" gives you is a picture of which directions in feature space (the principal components) account for the most variance. You can then pick a threshold value and eliminate the weaker components below a certain level. They show you in the assignment some examples of how the choice of threshold affects the resolution of the reconstructed images.
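As a rough illustration of those steps (covariance matrix, dominant eigenvector, projection), here is a minimal sketch in plain Python. It is not the course's implementation: the four data points are made up, the function names are hypothetical, and it uses power iteration to find only the top eigenvector, where a real library would use a full eigendecomposition or SVD.

```python
import math

def covariance_matrix(rows):
    """Return (sample covariance matrix, mean-centered rows) for a list of equal-length rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    return cov, centered

def top_eigenpair(matrix, iterations=500):
    """Power iteration: approximate the largest eigenvalue and its unit eigenvector."""
    d = len(matrix)
    v = [1.0 / math.sqrt(d)] * d  # arbitrary unit-length starting guess
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        length = math.sqrt(sum(x * x for x in w))
        v = [x / length for x in w]  # renormalize so v stays a unit vector
    # Rayleigh quotient v^T M v gives the eigenvalue for a unit vector v.
    eigenvalue = sum(v[i] * sum(matrix[i][j] * v[j] for j in range(d))
                     for i in range(d))
    return eigenvalue, v

# Made-up 2-D data lying close to the line y = x.
data = [[2.0, 1.9], [4.0, 4.1], [6.0, 6.2], [8.0, 7.8]]

cov, centered = covariance_matrix(data)
eigenvalue, direction = top_eigenpair(cov)

# Coordinates of each centered point along the first principal direction.
projected = [sum(c[j] * direction[j] for j in range(len(direction))) for c in centered]

# Fraction of the total variance captured by that single direction.
explained = eigenvalue / sum(cov[i][i] for i in range(len(cov)))
print(direction, explained)
```

Because `direction` is already a unit vector, the projection step is a plain dot product with no extra division by the norm. The `explained` ratio is the kind of number you would compare against a threshold when deciding how many components to keep.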

@Joseph_Rock I've recently been doing my math classes elsewhere. I picked up this text some time ago and mentioned it to Paul, and I finally dug into Gilbert Strang's "Linear Algebra for Everyone".

I wouldn't say it is entirely a walk in the park (but it is nowhere near as deep as Strang's "tome"). It walks you through everything you need to know about Linear Algebra as it concerns Data Science/ML, including a section on PCA, and it is thorough rather than cursory.

I even just noticed that at the end there is a short section that goes into Neural Nets and Convolutions, but in *math* terms.

@TMosh I would not entirely disagree with you that you don't need to know how to "make the hammer". Yet with the math, I think it depends on what you are learning ML for: interest, a hobby, your own programming?

Or are you looking for a job? These days in an interview there's a *pretty good* chance they'll ask you a math question or two.

I also agree with both of them, though, that Calculus isn't going to help you at all with understanding SVD/PCA.

Well, I've been thinking about what you wrote, and the term "return on investment" makes perfect sense to me. As I said before, I'm a final-year CS student, and I have a lot of things to learn and do that have higher priority, I guess. As tempting as it is to chase the underlying math, I have to use my resources (time and energy) efficiently and spend them on the priorities. If I do a master's or PhD, I think I'll have greater opportunities and more time to learn this kind of deep math. I guess properly learning the things taught in this course will be more than enough "at this level". So I've decided to just push myself to learn the course's concepts (and just the basic proofs, lol), improve my problem-solving and algorithm skills, and push for the things that I'm able to learn. If something is a deep and dangerous sea for me (only for now), I'll note it and come back to work on it when I have a stronger theoretical mathematical background. Thanks for your advice, and sorry if I've been a difficult student. I hope to see you in the next courses' questions.

Thanks for the suggestions, I'll definitely consider it. I'm just a student pushing my limits so that I can be comfortable for the rest of my life.