Natural Language Processing with Classification and Vector Spaces

Hi, I am trying to do the final lab for week 3 of Natural Language Processing with Classification and Vector Spaces.

Assignment: Vector Space Models | Coursera

My problem is with the final question, the compute_pca function.

This is my attempt. I am getting the orientation of one or more of the matrices wrong. The incoming X matrix for the example has shape (3, 10), and I am sure that the covariance matrix I need has shape (10, 10). Calling np.cov(X_demeaned) gives a (3, 3) covariance matrix, so I am fairly confident that np.cov(X_demeaned, rowvar=False) is correct. From then on, though, I have no confidence that the matrices or vectors are the right way round. I have to use eigen_vecs_subset.T to make the final dot product work, but the numbers it produces are wrong, so something somewhere is the wrong way round. I have tried many different permutations of transposing things, taking columns instead of rows for the subsets, and so on, but I haven’t been able to find the right combination. Can you help? Thank you!
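For anyone reading along, the shape behaviour described above can be checked in isolation. This is a minimal sketch using made-up random data of the same shape as the example (the data and the names cov_rows and cov_cols are illustrative, not from the assignment):

```python
import numpy as np

# Hypothetical data with the same shape as the example:
# 3 observations (rows) and 10 features (columns).
rng = np.random.default_rng(0)
X = rng.random((3, 10))

# Default rowvar=True: each ROW is treated as a variable.
cov_rows = np.cov(X)

# rowvar=False: each COLUMN is treated as a variable.
cov_cols = np.cov(X, rowvar=False)

print(cov_rows.shape)  # (3, 3)
print(cov_cols.shape)  # (10, 10)
```

Passing rowvar=False is equivalent to calling np.cov on the transposed matrix, which is one way to sanity-check which orientation is in use.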

Posting code from a graded cell function is against the community guidelines, so the code has been removed under the Code of Conduct. Please post a screenshot of the error, or your output alongside the expected output. If a mentor wants to see your code, they will ask for it.

I instrumented my code for compute_pca to show the dimensions of all the relevant objects and here’s what I see with code that passes the tests:

X.shape (3, 10)
X_demeaned.shape (3, 10)
covariance_matrix.shape (10, 10)
eigen_vals [-7.03941390e-17 -3.60417070e-17 -1.30858621e-17 -8.61317229e-19
  2.07977247e-19  3.78308880e-18  1.81729034e-17  5.06232858e-17
  2.50881048e-01  5.48501886e-01]
idx_sorted [0 1 2 3 4 5 6 7 8 9]
eigen_vecs_subset.shape (10, 2)
X_reduced.shape (3, 2)
Your original matrix was (3, 10) and it became:
[[ 0.43437323  0.49820384]
 [ 0.42077249 -0.50351448]
 [-0.85514571  0.00531064]]

Please compare that to what you are getting and let us know if that sheds any light.

OK yes!

First of all, posting the eigenvalues helped me find that I was demeaning X with the mean of the whole of X, rather than demeaning each row of X with the mean of that row.
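To illustrate the difference, here is a small sketch of the possible ways to demean a matrix with NumPy, again on made-up data. Which axis is the right one depends on whether rows or columns hold the variables; the post does not spell that out, so all variants are shown. As an assumption on my part: when the covariance is computed with rowvar=False (columns as variables), the matching choice would be the per-column mean, axis=0.

```python
import numpy as np

# Hypothetical data: 3 observations (rows), 10 features (columns).
rng = np.random.default_rng(1)
X = rng.random((3, 10))

# One scalar mean for the whole matrix (the mistake described above).
global_demeaned = X - np.mean(X)

# Per-column mean, shape (10,), broadcast across the rows.
col_demeaned = X - np.mean(X, axis=0)

# Per-row mean, shape (3, 1), broadcast across the columns.
row_demeaned = X - np.mean(X, axis=1, keepdims=True)

# After per-column demeaning every column averages to ~0;
# after per-row demeaning every row averages to ~0.
print(np.allclose(col_demeaned.mean(axis=0), 0))
print(np.allclose(row_demeaned.mean(axis=1), 0))
# With the global mean, individual columns generally do not average to 0.
print(np.allclose(global_demeaned.mean(axis=0), 0))
```

Printing the column or row means after demeaning is a quick way to confirm which variant the code is actually applying.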

Secondly, the problem was the sorting of the eigenvalues, where I had to transpose them, sort the transposed matrix, and transpose back.
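A generic way to keep the sort consistent, without transposing back and forth, is to compute one index order from the eigenvalues and apply it to the columns of the eigenvector matrix. The sketch below is not the assignment's reference solution: it uses made-up data, assumes per-column demeaning, and the variable names are illustrative. np.linalg.eigh returns the eigenvalues in ascending order with the eigenvectors as columns.

```python
import numpy as np

# Hypothetical data: 3 observations (rows), 10 features (columns).
rng = np.random.default_rng(2)
X = rng.random((3, 10))
X_demeaned = X - X.mean(axis=0)            # assumption: columns are the variables
cov = np.cov(X_demeaned, rowvar=False)     # shape (10, 10)

# eigh is meant for symmetric matrices: eigenvalues come back in ascending
# order and eigen_vecs[:, i] is the eigenvector for eigen_vals[i].
eigen_vals, eigen_vecs = np.linalg.eigh(cov)

# Indices that sort the eigenvalues from largest to smallest.
idx_desc = np.argsort(eigen_vals)[::-1]
vals_sorted = eigen_vals[idx_desc]
vecs_sorted = eigen_vecs[:, idx_desc]      # reorder COLUMNS, not rows

n_components = 2
vecs_subset = vecs_sorted[:, :n_components]  # shape (10, 2)
X_reduced = X_demeaned @ vecs_subset         # shape (3, 2)

print(vecs_subset.shape, X_reduced.shape)
```

Indexing the columns with the same idx_desc that orders the eigenvalues keeps each eigenvector paired with its eigenvalue, which is the usual place where a row/column mix-up creeps in. Eigenvector signs are not unique, so individual columns of X_reduced may come out negated relative to another implementation.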

And now it works, thank you!
