In the video **PCA Algorithm** of the PCA section, Prof. Andrew talked about an arrow vector of length 1 pointing in the direction of the z-axis.

So my question is: how is this arrow vector obtained, and why do we take the dot product to project onto the z-axis?

The idea of PCA in a nutshell is to reduce the number of dimensions required to describe a point by approximating it as a projection onto a lower-dimensional space.

In this case, the point is located at (2, 3), which requires two dimensions (x1-axis and x2-axis) to describe, and we are trying to approximate it by projecting it onto a one-dimensional space (a line) described by the unit vector z. The direction of the vector z, I believe, was chosen arbitrarily to illustrate an example.

Taking the dot product of the point [2, 3] and the unit vector [0.71, 0.71] gives you 3.55, which is the distance from the origin along the line described by the vector z. Now, instead of requiring two values (2, 3) to describe the point on the x1,x2 plane, we can use just one value (3.55) to describe the point on the line indicated by z.

Incidentally, the unit vector [0.71, 0.71] is actually [1/sqrt(2), 1/sqrt(2)] rounded to two decimal places, so the exact dot product is 2.5×sqrt(2), which is about 3.5355. And if you multiply z by this dot product, you get [2.5, 2.5], which are the actual coordinates of the projection point used to approximate the original point [2, 3].
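The arithmetic above can be checked with a few lines of Python. This is only a minimal sketch: the helper name `project_onto_unit_vector` is made up for illustration, and it assumes the direction vector already has length 1.

```python
import math

def project_onto_unit_vector(point, u):
    """Return (coordinate along u, projected point), assuming u has length 1."""
    z = sum(p * c for p, c in zip(point, u))  # dot product = signed distance along u
    return z, [z * c for c in u]              # scale u by z to get back to 2-D coordinates

x = [2.0, 3.0]
u_rounded = [0.71, 0.71]                        # rounded values from the lecture example
u_exact = [1 / math.sqrt(2), 1 / math.sqrt(2)]  # exact unit vector in the same direction

z_rounded, _ = project_onto_unit_vector(x, u_rounded)
z_exact, x_approx = project_onto_unit_vector(x, u_exact)

print(round(z_rounded, 2))              # 3.55
print(round(z_exact, 4))                # 3.5355
print([round(v, 4) for v in x_approx])  # [2.5, 2.5]
```

The single number `z_exact` is the compressed representation of the point, and `x_approx` is what you get back when you map it into the original x1,x2 plane again.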

Hope this helps.

It’s actually obtained by solving an eigenvalue problem for the covariance matrix of the data. Here more info can be found on the steps involved, as well as some background info on eigenvectors etc.
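To make the eigenvalue step concrete, here is a rough sketch for the 2-D case only, where the eigenvalues of the symmetric 2×2 covariance matrix have a closed form. The function name and the toy data are assumptions for illustration; the course material may use a different routine (e.g. an SVD) to get the same direction.

```python
import math

def first_principal_component_2d(points):
    """Unit eigenvector (and eigenvalue) of the 2x2 covariance matrix with the largest eigenvalue."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    # Entries of the covariance matrix [[a, b], [b, c]] of the mean-centred data.
    # Dividing by n or n - 1 changes the eigenvalues but not the eigenvectors.
    a = sum((p[0] - mean_x) ** 2 for p in points) / n
    c = sum((p[1] - mean_y) ** 2 for p in points) / n
    b = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points) / n
    # Largest eigenvalue of a symmetric 2x2 matrix (closed form).
    lam = (a + c) / 2 + math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    # A matching eigenvector; fall back to a coordinate axis if the features are uncorrelated.
    if abs(b) > 1e-12:
        v = (b, lam - a)
    else:
        v = (1.0, 0.0) if a >= c else (0.0, 1.0)
    norm = math.hypot(v[0], v[1])
    return (v[0] / norm, v[1] / norm), lam

# Hypothetical toy data lying exactly on the line x2 = x1.
diagonal_points = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (4.0, 4.0)]
u, lam = first_principal_component_2d(diagonal_points)
print([round(c, 4) for c in u])  # [0.7071, 0.7071]
```

For data spread along the diagonal, the resulting unit vector is the same [1/sqrt(2), 1/sqrt(2)] direction used in the example above. Note that the sign of an eigenvector is arbitrary, so the opposite direction would describe the same line.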

Usually we want to reduce the dimensionality of the feature space to get a better ratio of data to dimensions (which can often help to mitigate overfitting when dealing with a limited amount of data). This transformation gets rid of redundant information in our features by mapping them to a smaller space spanned by the principal components (a subset of the eigenvectors of the eigenvalue problem mentioned above); see also this thread: Does embedding projector use dimensional reduction? - #4 by Christian_Simonis

Here is some example code which you can use to play around with and, e.g., check how much information is explained by each of the principal components:

(As you see, the last PCs do not provide too much value here information-wise…)
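A minimal sketch of such a check is given below. It is not the snippet from the original post: the synthetic data, the function names, and the choice of plain power iteration with deflation (instead of a library eigensolver) are all assumptions made to keep the example dependency-free.

```python
import math
import random

def covariance_matrix(rows):
    """Sample covariance matrix of the rows (one feature per column)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centred = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [[sum(r[i] * r[j] for r in centred) / (n - 1) for j in range(d)]
            for i in range(d)]

def top_eigenpair(mat, rng, iters=500):
    """Power iteration: dominant eigenvalue and eigenvector of a symmetric PSD matrix."""
    d = len(mat)
    v = [rng.uniform(-1.0, 1.0) for _ in range(d)]  # random start vector
    for _ in range(iters):
        w = [sum(mat[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm < 1e-15:  # matrix is (numerically) zero
            return 0.0, v
        v = [x / norm for x in w]
    # Rayleigh quotient of the unit vector v gives the eigenvalue estimate.
    lam = sum(v[i] * sum(mat[i][j] * v[j] for j in range(d)) for i in range(d))
    return lam, v

def explained_variance_ratios(rows, seed=0):
    """Share of the total variance captured by each principal component, largest first."""
    rng = random.Random(seed)
    mat = covariance_matrix(rows)
    d = len(mat)
    total = sum(mat[i][i] for i in range(d))  # trace = total variance
    ratios = []
    for _ in range(d):
        lam, v = top_eigenpair(mat, rng)
        ratios.append(lam / total)
        # Deflation: subtract the component just found so the next one becomes dominant.
        mat = [[mat[i][j] - lam * v[i] * v[j] for j in range(d)] for i in range(d)]
    return ratios

# Hypothetical data: feature 2 is nearly a multiple of feature 1, feature 3 is small noise.
data_rng = random.Random(42)
rows = []
for _ in range(200):
    t = data_rng.gauss(0.0, 1.0)
    rows.append([t, 2.0 * t + data_rng.gauss(0.0, 0.1), data_rng.gauss(0.0, 0.1)])

ratios = explained_variance_ratios(rows)
print([round(r, 4) for r in ratios])
```

With data constructed like this, nearly all of the variance should land in the first ratio, while the last two stay close to zero. That is the pattern that justifies dropping the trailing principal components.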

Best regards

Christian

Not so sure, but I feel you could think of z as the span of the vector u = [0.71, 0.71] and use the projection formula ((x·u)/(u·u)) u to get the projection point, since you can consider [0.71, 0.71] as a basis of z. Because the norm of [0.71, 0.71] is (approximately) 1, you can simplify the projection formula to (x·u) u, so the projection point is 3.55 × [0.71, 0.71]. And since [0.71, 0.71] has length 1, the distance from the origin is literally 3.55.
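Written out, the simplification in this reply looks as follows. The symbols x for the point and u for the direction vector follow the reply; the exact unit vector [1/sqrt(2), 1/sqrt(2)] is used here instead of the rounded [0.71, 0.71].

```latex
\operatorname{proj}_{u}(x) = \frac{x \cdot u}{u \cdot u}\, u
\quad\xrightarrow{\;\|u\| = 1\;}\quad
\operatorname{proj}_{u}(x) = (x \cdot u)\, u

x = \begin{bmatrix} 2 \\ 3 \end{bmatrix},\quad
u = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ 1 \end{bmatrix}
\;\Longrightarrow\;
x \cdot u = \frac{5}{\sqrt{2}} \approx 3.5355,\qquad
(x \cdot u)\, u = \begin{bmatrix} 2.5 \\ 2.5 \end{bmatrix},\qquad
\bigl\| (x \cdot u)\, u \bigr\| = |x \cdot u| \approx 3.5355
```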