C1W3 Assignment, PCA

good afternoon!

I followed the prompts, but two tests still fail in the last exercise, Exercise 5 - compute_pca:

(10, 10)
Your original matrix was (3, 10) and it became:
[[-0.1697529 -0.09637353]
[ 0.29692971 0.18346481]
[-0.12717681 -0.08709128]]

Test your function

w3_unittest.test_compute_pca(compute_pca)
(10, 10)
Wrong accuracy output.
Expected: [[ 0.43437323 0.49820384]
[ 0.42077249 -0.50351448]
[-0.85514571 0.00531064]].
Got: [[ 0.04995117 -0.02373717]
[-0.1846061 -0.23921252]
[ 0.13465493 0.26294969]].
(15, 15)
Wrong accuracy output.
Expected: [[-0.32462796 0.01881248 -0.51389463]
[-0.36781354 0.88364184 0.05985815]
[-0.75767901 -0.69452194 0.12223214]
[ 1.01698298 -0.17990871 -0.33555475]
[ 0.43313753 -0.02802368 0.66735909]].
Got: [[-0.2531877 0.0008998 -0.1054543 ]
[ 0.30140213 0.26223506 0.29022403]
[-0.61222074 -0.48952961 -0.28125315]
[ 0.3243839 0.30008201 -0.22384007]
[ 0.23962241 -0.07368726 0.32032349]].
4 Tests passed
2 Tests failed

Hi @Igor_Pereverzev

It’s hard to tell where you went wrong just from this message.

These checkpoints (variable name, shape, values) could help you find the mistake:

X_demeaned
 (3, 10) 
 [[-1.28631691e-01 -7.09440297e-02 -1.72549226e-01 -3.21924969e-01
  -2.03421655e-01 -4.60132328e-01 -4.32761970e-02  3.11256141e-02
   1.61105863e-01  4.63158497e-04]
 [-1.26459182e-01 -1.06049023e-01  3.17886488e-02  2.53859895e-01
  -3.22789952e-01  1.17996587e-01  1.87768394e-01  2.44254716e-01
  -9.52746722e-02 -3.40252086e-01]
 [ 2.55090873e-01  1.76993053e-01  1.40760577e-01  6.80650741e-02
   5.26211607e-01  3.42135741e-01 -1.44492197e-01 -2.75380330e-01
  -6.58311912e-02  3.39788928e-01]] 

covariance_matrix
 (10, 10) 
 [[ 4.88046950e-02  3.38429177e-02  2.70410354e-02  1.33348089e-02
   1.00609001e-01  6.57707762e-02 -2.75184938e-02 -5.25695002e-02
  -1.27339493e-02  6.48227388e-02]
 [ 3.38429177e-02  2.38029956e-02  1.68919133e-02  3.98205302e-03
   7.08994547e-02  4.03429340e-02 -2.12082910e-02 -3.84257778e-02
  -6.48868839e-03  4.80954113e-02]
 [ 2.70410354e-02  1.68919133e-02  2.52986469e-02  3.65993232e-02
   4.94545211e-02  6.56528268e-02 -3.45131361e-03 -1.81844337e-02
  -2.00468908e-02  1.84664070e-02]
 [ 1.33348089e-02  3.98205302e-03  3.65993232e-02  8.63566931e-02
   9.67985927e-03  1.00685091e-01  2.58818405e-02  1.66212907e-02
  -4.02656116e-02 -3.16988513e-02]
 [ 1.00609001e-01  7.08994547e-02  4.94545211e-02  9.67985927e-03
   2.11236189e-01  1.17774282e-01 -6.39199532e-02 -1.15041459e-01
  -1.83299257e-02  1.44268308e-01]
 [ 6.57707762e-02  4.03429340e-02  6.56528268e-02  1.00685091e-01
   1.17774282e-01  1.71350909e-01 -3.68356893e-03 -3.98590657e-02
  -5.39476528e-02  3.79461186e-02]
 [-2.75184938e-02 -2.12082910e-02 -3.45131361e-03  2.58818405e-02
  -6.39199532e-02 -3.68356893e-03  2.90038970e-02  4.21533131e-02
  -7.67476391e-03 -5.65027401e-02]
 [-5.25695002e-02 -3.84257778e-02 -1.81844337e-02  1.66212907e-02
  -1.15041459e-01 -3.98590657e-02  4.21533131e-02  6.82317479e-02
  -6.40769360e-05 -8.83324737e-02]
 [-1.27339493e-02 -6.48868839e-03 -2.00468908e-02 -4.02656116e-02
  -1.83299257e-02 -5.39476528e-02 -7.67476391e-03 -6.40769360e-05
   1.96830541e-02  5.06165683e-03]
 [ 6.48227388e-02  4.80954113e-02  1.84664070e-02 -3.16988513e-02
   1.44268308e-01  3.79461186e-02 -5.65027401e-02 -8.83324737e-02
   5.06165683e-03  1.15614106e-01]] 

eigen_vals
 (10,) 
 [-7.03941390e-17 -3.60417070e-17 -1.30858621e-17 -8.61317229e-19
  2.07977247e-19  3.78308880e-18  1.81729034e-17  5.06232858e-17
  2.50881048e-01  5.48501886e-01] 

eigen_vecs
 (10, 10) 
 [[-8.62414327e-01 -2.56694396e-01  2.14123803e-02 -3.65464124e-03
  -3.74506544e-02 -4.03836996e-02 -1.56553129e-01 -2.70841836e-01
   1.88121436e-03 -2.98289378e-01]
 [-3.29309661e-01  3.04399817e-01  1.12224764e-01  3.59044375e-02
   2.29801242e-02  1.33075225e-01  3.25446102e-01  7.85293769e-01
   3.78517564e-02 -2.06739077e-01]
 [ 1.87819142e-01 -6.18147267e-01  1.19968071e-01 -2.98746200e-02
  -5.75006928e-02 -1.07102136e-02 -5.11284915e-01  4.84039304e-01
  -2.01735451e-01 -1.65857022e-01]
 [-4.20652772e-02  3.25105357e-01 -4.37801744e-01 -4.72579427e-01
  -1.94733425e-01  2.40711818e-01 -2.13686058e-01 -2.63117023e-04
  -5.73668120e-01 -8.31573118e-02]
 [ 2.23805479e-01  3.80401557e-01  5.16032016e-01 -4.86764808e-02
  -1.21362454e-01  1.15351645e-01 -2.77282609e-01 -1.94139729e-01
   1.27507609e-01 -6.14555453e-01]
 [ 1.59746860e-01 -2.60818962e-01  9.97801283e-02  1.09457118e-01
   1.83574048e-01 -6.67072533e-02  5.74767815e-01 -1.62150624e-01
  -5.71656816e-01 -4.03640689e-01]
 [-6.40361647e-02  1.45042422e-01  2.56135146e-01 -5.13757607e-02
  -5.28489208e-01 -7.34660430e-01  3.62060096e-02  6.29545338e-02
  -2.32922772e-01  1.67521425e-01]
 [-7.35300366e-02 -8.51947951e-02  3.21840435e-01  3.31636166e-01
  -5.14697062e-01  5.87774571e-01  8.13824674e-02 -8.23480629e-02
  -2.17117497e-01  3.20679030e-01]
 [ 1.89017271e-02 -2.92038448e-01  2.64837428e-01 -7.90568092e-01
  -1.40259145e-01  1.40116195e-01  3.28628418e-01 -2.80073859e-02
   2.54874036e-01  7.85652485e-02]
 [ 1.58992560e-01 -1.65334248e-01 -5.15086352e-01  1.50011745e-01
  -5.86964321e-01 -3.15471259e-03  1.98312007e-01  1.23421956e-02
   3.45496594e-01 -3.95200626e-01]] 

idx_sorted
 (10,) 
 [0 1 2 3 4 5 6 7 8 9] 

idx_sorted_decreasing
 (10,) 
 [9 8 7 6 5 4 3 2 1 0] 

eigen_vals_sorted
 (10,) 
 [ 5.48501886e-01  2.50881048e-01  5.06232858e-17  1.81729034e-17
  3.78308880e-18  2.07977247e-19 -8.61317229e-19 -1.30858621e-17
 -3.60417070e-17 -7.03941390e-17] 

eigen_vecs_sorted
 (10, 10) 
 [[-2.98289378e-01  1.88121436e-03 -2.70841836e-01 -1.56553129e-01
  -4.03836996e-02 -3.74506544e-02 -3.65464124e-03  2.14123803e-02
  -2.56694396e-01 -8.62414327e-01]
 [-2.06739077e-01  3.78517564e-02  7.85293769e-01  3.25446102e-01
   1.33075225e-01  2.29801242e-02  3.59044375e-02  1.12224764e-01
   3.04399817e-01 -3.29309661e-01]
 [-1.65857022e-01 -2.01735451e-01  4.84039304e-01 -5.11284915e-01
  -1.07102136e-02 -5.75006928e-02 -2.98746200e-02  1.19968071e-01
  -6.18147267e-01  1.87819142e-01]
 [-8.31573118e-02 -5.73668120e-01 -2.63117023e-04 -2.13686058e-01
   2.40711818e-01 -1.94733425e-01 -4.72579427e-01 -4.37801744e-01
   3.25105357e-01 -4.20652772e-02]
 [-6.14555453e-01  1.27507609e-01 -1.94139729e-01 -2.77282609e-01
   1.15351645e-01 -1.21362454e-01 -4.86764808e-02  5.16032016e-01
   3.80401557e-01  2.23805479e-01]
 [-4.03640689e-01 -5.71656816e-01 -1.62150624e-01  5.74767815e-01
  -6.67072533e-02  1.83574048e-01  1.09457118e-01  9.97801283e-02
  -2.60818962e-01  1.59746860e-01]
 [ 1.67521425e-01 -2.32922772e-01  6.29545338e-02  3.62060096e-02
  -7.34660430e-01 -5.28489208e-01 -5.13757607e-02  2.56135146e-01
   1.45042422e-01 -6.40361647e-02]
 [ 3.20679030e-01 -2.17117497e-01 -8.23480629e-02  8.13824674e-02
   5.87774571e-01 -5.14697062e-01  3.31636166e-01  3.21840435e-01
  -8.51947951e-02 -7.35300366e-02]
 [ 7.85652485e-02  2.54874036e-01 -2.80073859e-02  3.28628418e-01
   1.40116195e-01 -1.40259145e-01 -7.90568092e-01  2.64837428e-01
  -2.92038448e-01  1.89017271e-02]
 [-3.95200626e-01  3.45496594e-01  1.23421956e-02  1.98312007e-01
  -3.15471259e-03 -5.86964321e-01  1.50011745e-01 -5.15086352e-01
  -1.65334248e-01  1.58992560e-01]] 

eigen_vecs_subset
 (10, 2) 
 [[-0.29828938  0.00188121]
 [-0.20673908  0.03785176]
 [-0.16585702 -0.20173545]
 [-0.08315731 -0.57366812]
 [-0.61455545  0.12750761]
 [-0.40364069 -0.57165682]
 [ 0.16752142 -0.23292277]
 [ 0.32067903 -0.2171175 ]
 [ 0.07856525  0.25487404]
 [-0.39520063  0.34549659]] 

X_reduced
 (3, 2) 
 [[ 0.43437323  0.49820384]
 [ 0.42077249 -0.50351448]
 [-0.85514571  0.00531064]]

If you’re still unable to find the solution, please pinpoint the variable where you got stuck.

Note: you can add print statements for simple debugging, just don’t forget to remove them before submitting the Assignment.
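For example, a small helper along these lines prints a variable in the same name / shape / values format as the checkpoints above (the helper name `checkpoint` and the random stand-in data are just illustrative, not part of the assignment):

```python
import numpy as np

def checkpoint(name, value):
    # Print a variable's name, shape and values, mirroring the
    # checkpoint format shown above.
    arr = np.asarray(value)
    print(name)
    print(f" {arr.shape} ")
    print(f" {arr} ")

# Example: inspect the de-meaning step.
X = np.random.rand(3, 10)             # stand-in for the word embeddings
X_demeaned = X - np.mean(X, axis=0)   # subtract each column's mean
checkpoint("X_demeaned", X_demeaned)
```

Dropping a call like this after each step makes it easy to compare your intermediate values against the dump above.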

Good luck!


For the instructors, please: why are two of the hints in the assignment completely misleading? The hint part is not as straightforward as stated.

Hi @Fred_Hannoyer

Yes, the hints part is not straightforward, but should it be? And I would disagree that it’s “completely misleading”.

I see it as the course creators specifically wanting learners to pay attention to which dimensions the operations should be taken over (especially when mentioning things like “By default rowvar is True”). They could easily have given all the answers (complete solutions) away, but would that make for the best learning?

Ok, maybe my choice of word was harsh - but it is at least misleading as written. The formulation is “use …axis=None)”, and then only two sentences later “you need to specify the optional parameter”. That could be misunderstood if you don’t realize the first part is just a copy-paste of the function prototype with its default value - as if the hint said “this is why we specified axis=None in the hint”, or “we got reports of issues with different Python configurations and different default values, so we prefer to ask you to specify the parameter”.
I would reformulate it as “Use the function numpy…axis=None (default)) … please choose the optional argument for this exercise carefully…”
By the way, I am not asking you to give the complete answer in the hint, just to clarify that you are not giving it :slight_smile: - I am only pointing out a problem in the user experience. Many of the hinted functions used in the assignment have not been seen with these parameters in any content before the assignment, so students are desperate and don’t question a generous-looking hint.
Hope this helps, and thank you for the quick reply and the very high quality material.

Hi @Fred_Hannoyer

I understand your point and I’m grateful that you clarified it as being well-intentioned :+1:

I may not have made myself entirely clear - the main point of my response is that this is a grey area: different people prefer different formulations, and what reads fine to one person is misleading to another. I’m not a native English speaker, so it’s not for me to debate sentence structure or grammar, but I can try to explain what I think:

If we read “Use numpy.mean(a,axis=None)”… period, then yes, that would be misleading but in reality the full sentence is:

  • “Use numpy.mean(a,axis=None) which takes one required parameter.”

And, to me, the “required parameter” makes a big difference. Same goes for:

  • “Use numpy.cov(m, rowvar=True) which takes one required parameter.”

The following sentences in each paragraph are all about the parameter so I would argue that it’s not very misleading.
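For what it’s worth, here is a quick demo of why those optional parameters matter here (toy data, not the assignment’s; the (3, 10) → (10, 10) covariance shape in the checkpoints above implies the non-default rowvar):

```python
import numpy as np

X = np.arange(12, dtype=float).reshape(3, 4)  # 3 samples, 4 features

# np.mean: the axis argument decides what gets averaged.
print(np.mean(X))            # axis=None (default): one scalar over all entries
print(np.mean(X, axis=0))    # per-feature means, shape (4,) -- what de-meaning needs

# np.cov: rowvar decides what counts as a variable.
print(np.cov(X, rowvar=True).shape)   # (3, 3): each ROW is a variable (default)
print(np.cov(X, rowvar=False).shape)  # (4, 4): each COLUMN is a variable
```

So with the default values, both calls silently produce the wrong shapes for this exercise, which is exactly why the hints stress the optional parameter.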

I think you have a good point about the consistency of the hints:

And I agree that some hints give away the solution too easily, while others remain cryptic.

Anyway, thank you for your input; it will not be forgotten :slight_smile:

A couple of additional points:

  1. The lecture notes say to de-mean and divide out the standard deviation (1/sigma), but the tests seem to fail if you do that (and you don’t get the same intermediate results as arvyzukai posted above).
  2. The lecture notes say to use SVD, but don’t use np.linalg.svd - use np.linalg.eigh, otherwise the second eigenvector has the opposite sign (completely arbitrary and mathematically unimportant, but the test is not robust against this).

Hi @David_Fox

These are excellent points! :+1: It seems that these aspects could have been explained more clearly in the assignment.

Regarding point 1:
Indeed, the instructions state to “de-mean” the data, while the code comment suggests “mean centering the data.” Technically, it is accurate and does not require “standardizing” the data. However, it is understandable if learners mistakenly associate it with standardization, as it is a common default when performing PCA.
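To make the distinction concrete, a minimal sketch (de-meaning only is what the checkpoints above reflect; the second line is the extra step that breaks the tests):

```python
import numpy as np

X = np.array([[1., 2.], [3., 6.], [5., 10.]])

X_demeaned = X - np.mean(X, axis=0)              # mean centering only
X_standardized = X_demeaned / np.std(X, axis=0)  # additionally divides by sigma

print(X_demeaned.mean(axis=0))     # ~0 after either step
print(X_standardized.std(axis=0))  # exactly 1 only after standardizing
```

Both versions are centered, but dividing by sigma rescales every feature, so all downstream eigenvalues and projections change, which is why a grader expecting plain mean centering rejects the standardized result.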

Regarding point 2:
Yes, this occasionally leads to confusion.

Thank you for bringing these concerns to attention. As the meme suggests, PCA can indeed be confusing! :slight_smile:

C1_W3_Assignment:UNQ_C5 GRADED FUNCTION: compute_pca
I have also found some of the hints misleading, and you need fairly deep knowledge of PCA.
First, you only need to sort once to get the reordering index. The data suggests I did not have to reverse the reordering index.

Also, since most of us don’t do PCA daily, can you clarify to everyone that the reordering index must reorder the eigenvectors’ columns, not their rows? It took me 24 hours to realize that. We need more information about this in the lecture.
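For anyone else stuck on this, a minimal sketch of the single-sort, column-wise reordering (a toy 2×2 matrix, not the assignment code; the key fact is that eigh returns eigenvectors as columns, so `eigen_vecs[:, i]` pairs with `eigen_vals[i]`):

```python
import numpy as np

A = np.array([[2., 0.], [0., 5.]])          # symmetric, eigenvalues 2 and 5
eigen_vals, eigen_vecs = np.linalg.eigh(A)  # eigenvectors are the COLUMNS

# One argsort gives increasing order; reversing it gives decreasing order.
idx_sorted_decreasing = np.argsort(eigen_vals)[::-1]

eigen_vals_sorted = eigen_vals[idx_sorted_decreasing]
eigen_vecs_sorted = eigen_vecs[:, idx_sorted_decreasing]  # reorder COLUMNS, not rows

print(eigen_vals_sorted)   # [5. 2.]
```

Indexing `eigen_vecs[idx_sorted_decreasing]` instead would shuffle the rows and silently scramble every eigenvector, which is exactly the 24-hour trap described above.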

What’s the deal with the “np.linalg.svd” and “np.linalg.eigh” functions? They are not equivalent under equivalent conditions.
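They do agree on the principal subspace: for a symmetric covariance matrix, each eigenvector matches the corresponding SVD vector only up to sign (and up to basis choice if eigenvalues repeat). A quick check with random data (a sketch, not the grader’s test):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 4))
C = np.cov(X, rowvar=False)                 # symmetric PSD covariance matrix

eigen_vals, eigen_vecs = np.linalg.eigh(C)  # ascending eigenvalues
U, S, Vt = np.linalg.svd(C)                 # descending singular values

# For a symmetric PSD matrix, eigenvalues equal singular values (up to ordering).
print(np.allclose(eigen_vals[::-1], S))     # True

# Each eigenvector matches the corresponding SVD vector only up to sign.
for i in range(4):
    v_eigh = eigen_vecs[:, ::-1][:, i]
    v_svd = U[:, i]
    print(np.allclose(v_eigh, v_svd) or np.allclose(v_eigh, -v_svd))  # True
```

So both routes are mathematically valid PCA; a test that compares raw eigenvector entries element-by-element will only accept the sign convention of whichever function the reference solution used.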