Description:
I have a few questions regarding the graded assignment. Why do we need to sort eigen_vals, since the sorted values aren’t used? The indices from np.argsort(eigen_vals) are applied directly to the eigen_vecs matrix, which is then used to transform the input matrix X. Why do we need to sort the eigenvalues as well?
The point of the PCA algorithm is that you want to reduce the dimensions by removing the dimensions that are the least meaningful. Think about what the eigenvalues and eigenvectors mean: the eigenvectors form a basis for the transformation, and each eigenvector e_i is a direction along which the transformation acts as pure scaling: A \cdot e_i = \lambda_i \cdot e_i. That gives you the information you need: the larger the magnitude of the eigenvalue, the more meaningful that dimension is in the transformation. So you want to remove the dimensions starting with the smallest eigenvalues. I’ve never watched the lectures in NLP C3, but what I’m saying here is what I learned from Prof Ng when he discussed PCA in the original Stanford Machine Learning course. I would hope they would mention that in the lectures here.
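Here is a minimal sketch of that idea in NumPy (the data and variable names are illustrative, not the assignment’s): the covariance matrix of data stretched along one axis has one eigenvalue much larger than the other, and that eigenvalue marks the direction worth keeping.

```python
import numpy as np

# Toy data: 2D points stretched strongly along one axis, so one
# principal direction carries most of the variance.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2)) @ np.array([[3.0, 0.0], [0.0, 0.5]])

X_demeaned = X - X.mean(axis=0)
cov = np.cov(X_demeaned, rowvar=False)

# eigh is the right tool here because the covariance matrix is symmetric.
eigen_vals, eigen_vecs = np.linalg.eigh(cov)

# Larger eigenvalue => more variance along that eigenvector, so the
# least meaningful dimensions are the ones with the smallest eigenvalues.
assert eigen_vals[-1] > eigen_vals[0]
```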
Update: sorry, I probably missed the point of your question on the first pass. Yes, the argsort will accomplish what you really need, so it’s not clear why they would also care about having the eigenvalues themselves sorted as a separate thing.
Moreover, according to the docs, eigh returns the “eigenvalues in ascending order, each repeated according to its multiplicity.” So there’s no need for sorting at all; we can just reverse their order (and the corresponding eigenvectors’).
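To illustrate (the matrix here is just a small symmetric example, not the assignment’s data): eigh already hands back ascending eigenvalues, so descending order is a plain reversal, with the eigenvector columns reversed to match.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])  # symmetric, so eigh applies

eigen_vals, eigen_vecs = np.linalg.eigh(A)

# eigh documents that eigenvalues come back in ascending order...
assert np.all(np.diff(eigen_vals) >= 0)

# ...so descending order is just a reversal; the eigenvectors are the
# COLUMNS of eigen_vecs, so reverse the column order to keep them paired.
eigen_vals_desc = eigen_vals[::-1]
eigen_vecs_desc = eigen_vecs[:, ::-1]
```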
Also, the last instruction to multiply the transposed EVs by the zero-mean data and then transpose again seems a bit clumsy: (A^T \cdot B^T)^T = (B^T)^T \cdot (A^T)^T = B \cdot A
That’s a good point: since eigh already returns everything in ascending order, just reversing the result gets us what we need in a more efficient way.
It’s also a well known theorem that:
(A \cdot B)^T = B^T \cdot A^T
so you’re right that they could have expressed that more simply. I’ll take a look at the git issues on this course and file another one with your suggestions.
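A quick numeric check of that identity (shapes and names here are illustrative): the “transpose, multiply, transpose back” form and the direct product give the same matrix.

```python
import numpy as np

rng = np.random.default_rng(1)
X_demeaned = rng.normal(size=(5, 3))   # n samples x d features
E = rng.normal(size=(3, 2))            # d x k subset of eigenvector columns

# The clumsy form: transpose both factors, multiply, transpose the result.
clumsy = np.dot(E.T, X_demeaned.T).T

# By (A.B)^T = B^T.A^T this collapses to the direct product.
direct = X_demeaned @ E

assert np.allclose(clumsy, direct)
```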
Well, it’s slightly more subtle than that: what they really need are the indices of the elements in descending order. What they get from np.argsort is the list of indices, which they can then reverse and use to index the eigenvector matrix and select the corresponding vectors. They could have just used range(len(eigen_vals)) as an array and reversed that, but maybe that would have required more explanation and isn’t worth the saved sort. Note that the sort operation will be very cheap, because (as you pointed out) the values are already in the desired order.
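The two routes can be compared directly (toy values; the identity matrix stands in for the eigenvector columns): when the eigenvalues are already ascending, argsort-then-reverse and a plain column reversal select the same vectors.

```python
import numpy as np

eigen_vals = np.array([0.5, 1.2, 3.7])  # already ascending, as eigh returns them
eigen_vecs = np.eye(3)                  # stand-in for the eigenvector columns

# The assignment's route: argsort, reverse the index list, fancy-index.
idx_desc = np.argsort(eigen_vals)[::-1]
vecs_a = eigen_vecs[:, idx_desc]

# Equivalent when the values are already sorted: just reverse the columns.
vecs_b = eigen_vecs[:, ::-1]

assert np.array_equal(vecs_a, vecs_b)
```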
But I will file an enhancement request making the point about the needless transposes and the point that the OP of this thread makes about the sorted eigenvalues array being unnecessary.