C1 W3 Exercise 5 Expected Results

Hello everyone,

Summary

I’m stuck on the fifth exercise, compute_pca. I see that it is troubling others as well, and I’m unable to find enough help in these forums. My output does not match the expected result.

The problem:

When I execute the following cell

# Testing your function
np.random.seed(1)
X = np.random.rand(3, 10)
X_reduced = compute_pca(X, n_components=2)
print("Your original matrix was " + str(X.shape) + " and it became:")
print(X_reduced)

My result is

[[ 0.23132424  0.43767745]
 [ 0.2177235  -0.56404087]
 [-1.0581947  -0.05521575]]

What I have tried:

  1. Verified that I “de-meaned” the input data (X - mean), resulting in an array of shape (3, 10) for this example
  2. Computed the covariance using numpy.cov(m, rowvar=False), resulting in a (10, 10) array for this example
  3. Computed the eigenvalues and eigenvectors
  4. Computed the sorted indices
  5. Extracted the first n_components sorted eigenvectors $$U_{n\_components}$$ using the [:, 0:n_components] slice syntax
  6. Computed the matrix multiplication $$X' = X_{\text{demeaned}} \, U_{n\_components}$$, resulting in a (3, 2) array
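For comparison, here is a minimal sketch of the six steps above. This is not the graded solution, just how I would wire the steps together; using np.linalg.eigh is my own choice (it suits the symmetric covariance matrix and returns eigenvalues in ascending order), and the notebook may expect a different eigensolver.

```python
import numpy as np

def compute_pca(X, n_components=2):
    """Project X (n_observations, n_features) onto its first
    n_components principal components."""
    # 1. De-mean each column (feature): mean over axis=0
    X_demeaned = X - np.mean(X, axis=0)
    # 2. Covariance of the features: rowvar=False -> (n_features, n_features)
    covariance_matrix = np.cov(X_demeaned, rowvar=False)
    # 3. Eigendecomposition; eigh is for symmetric matrices and returns
    #    eigenvalues in ascending order
    eigen_vals, eigen_vecs = np.linalg.eigh(covariance_matrix)
    # 4. Indices that sort the eigenvalues in descending order
    idx_sorted = np.argsort(eigen_vals)[::-1]
    # 5. Keep the first n_components eigenvectors (stored as columns)
    eigen_vecs_subset = eigen_vecs[:, idx_sorted][:, :n_components]
    # 6. Project the de-meaned data onto the reduced basis
    return np.dot(X_demeaned, eigen_vecs_subset)

np.random.seed(1)
X = np.random.rand(3, 10)
X_reduced = compute_pca(X, n_components=2)
print(X_reduced.shape)  # (3, 2)
```

Note that eigenvector signs are arbitrary, so individual entries of the projection can legitimately differ in sign from the course’s expected output.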

My final figure is the following.

[image: final output figure, not reproduced here]

Furthermore, here are the input and intermediate values.

X (the data)

  • Shape: (3, 10)
  • Value:
    [[4.17022005e-01 7.20324493e-01 1.14374817e-04 3.02332573e-01 1.46755891e-01 9.23385948e-02 1.86260211e-01 3.45560727e-01 3.96767474e-01 5.38816734e-01]
     [4.19194514e-01 6.85219500e-01 2.04452250e-01 8.78117436e-01 2.73875932e-02 6.70467510e-01 4.17304802e-01 5.58689828e-01 1.40386939e-01 1.98101489e-01]
     [8.00744569e-01 9.68261576e-01 3.13424178e-01 6.92322616e-01 8.76389152e-01 8.94606664e-01 8.50442114e-02 3.90547832e-02 1.69830420e-01 8.78142503e-01]]

X_demeaned

  • Shape: (3, 10)
  • Value:
    [[-0.01842585 0.28487664 -0.43533348 -0.13311528 -0.28869196 -0.34310926 -0.24918764 -0.08988713 -0.03868038 0.10336888]
     [-0.01625334 0.24977165 -0.2309956 0.44266958 -0.40806026 0.23501966 -0.01814305 0.12324197 -0.29506092 -0.23734636]
     [ 0.36529671 0.53281372 -0.12202368 0.25687476 0.4409413 0.45915881 -0.35040364 -0.39639307 -0.26561743 0.44269465]]

Covariance Matrix of X_demeaned

  • Shape: (10, 10)
  • Value:
    [[ 4.88046950e-02 3.38429177e-02 2.70410354e-02 1.33348089e-02 1.00609001e-01 6.57707762e-02 -2.75184938e-02 -5.25695002e-02 -1.27339493e-02 6.48227388e-02]
     [ 3.38429177e-02 2.38029956e-02 1.68919133e-02 3.98205302e-03 7.08994547e-02 4.03429340e-02 -2.12082910e-02 -3.84257778e-02 -6.48868839e-03 4.80954113e-02]
     [ 2.70410354e-02 1.68919133e-02 2.52986469e-02 3.65993232e-02 4.94545211e-02 6.56528268e-02 -3.45131361e-03 -1.81844337e-02 -2.00468908e-02 1.84664070e-02]
     [ 1.33348089e-02 3.98205302e-03 3.65993232e-02 8.63566931e-02 9.67985927e-03 1.00685091e-01 2.58818405e-02 1.66212907e-02 -4.02656116e-02 -3.16988513e-02]
     [ 1.00609001e-01 7.08994547e-02 4.94545211e-02 9.67985927e-03 2.11236189e-01 1.17774282e-01 -6.39199532e-02 -1.15041459e-01 -1.83299257e-02 1.44268308e-01]
     [ 6.57707762e-02 4.03429340e-02 6.56528268e-02 1.00685091e-01 1.17774282e-01 1.71350909e-01 -3.68356893e-03 -3.98590657e-02 -5.39476528e-02 3.79461186e-02]
     [-2.75184938e-02 -2.12082910e-02 -3.45131361e-03 2.58818405e-02 -6.39199532e-02 -3.68356893e-03 2.90038970e-02 4.21533131e-02 -7.67476391e-03 -5.65027401e-02]
     [-5.25695002e-02 -3.84257778e-02 -1.81844337e-02 1.66212907e-02 -1.15041459e-01 -3.98590657e-02 4.21533131e-02 6.82317479e-02 -6.40769360e-05 -8.83324737e-02]
     [-1.27339493e-02 -6.48868839e-03 -2.00468908e-02 -4.02656116e-02 -1.83299257e-02 -5.39476528e-02 -7.67476391e-03 -6.40769360e-05 1.96830541e-02 5.06165683e-03]
     [ 6.48227388e-02 4.80954113e-02 1.84664070e-02 -3.16988513e-02 1.44268308e-01 3.79461186e-02 -5.65027401e-02 -8.83324737e-02 5.06165683e-03 1.15614106e-01]]

{Moderator’s Edit: Please mention Lab ID only when explicitly asked by someone}

Discovery

In my experience, the first and second hints for the exercise were misleading.

Hints 1 and 2

 * Use numpy.mean(a,axis=None) : If you set axis = 0, you take the mean for each column. If you set axis = 1, you take the mean for each row. Remember that each row is a word vector, and the number of columns are the number of dimensions in a word vector. 
 * Use numpy.cov(m, rowvar=True) . This calculates the covariance matrix. By default rowvar is True. From the documentation: "If rowvar is True (default), then each row represents a variable, with observations in the columns." In our case, each row is a word vector observation, and each column is a feature (variable). 
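To see concretely what rowvar changes, here is a quick check on the same (3, 10) test input used above (my own sanity-check snippet, not part of the assignment):

```python
import numpy as np

np.random.seed(1)
X = np.random.rand(3, 10)  # 3 observations (rows), 10 features (columns)

# rowvar=True (the default) treats each ROW as a variable -> (3, 3)
cov_default = np.cov(X, rowvar=True)
# rowvar=False treats each COLUMN as a variable -> (10, 10),
# which is what this PCA exercise needs
cov_features = np.cov(X, rowvar=False)

print(cov_default.shape, cov_features.shape)  # (3, 3) (10, 10)
```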

As soon as I changed the mean calculation to numpy.mean(a, axis=0) and the covariance calculation to numpy.cov(m, rowvar=False), the unit tests passed!
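The effect of the axis argument can be checked directly on the same test input (again, just a sanity-check sketch of my own):

```python
import numpy as np

np.random.seed(1)
X = np.random.rand(3, 10)

col_means = np.mean(X, axis=0)  # one mean per feature/column -> shape (10,)
row_means = np.mean(X, axis=1)  # one mean per word vector/row -> shape (3,)
print(col_means.shape, row_means.shape)  # (10,) (3,)

# De-meaning with the column means broadcasts over the (3, 10) array,
# and each column of the result averages to (numerically) zero
X_demeaned = X - col_means
print(np.allclose(X_demeaned.mean(axis=0), 0))  # True
```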

Hey @MattHo,
Welcome, and we are glad that you could be a part of our community :partying_face: Thanks a lot for letting us know that your issue has been resolved. As for the hints, they mention the functions with their default hyper-parameters. If you read the hints completely, you will find that this has been highlighted explicitly. I hope this helps.

Cheers,
Elemento

Hello @Elemento ,

As for the hints, they mention the functions with their default hyper-parameters. If you read the hints completely, you will find that it has been highlighted explicitly. I hope this helps.

Thank you for attempting to clarify the problem. I appreciate your time. Now that you explained the ‘highlight’ was intended to provide context, I should understand future hints better.

Let me critique your response, as I believe the hint instructions are pedagogically flawed. I read the hints completely and multiple times, and it was not clear that these were the ‘defaults’. It was also not clear that the ‘highlight’ was meant to tell the student to vary the default. I suggest a simple rewrite like the following, where I use an ellipsis to indicate that nothing else changes.

* Use numpy.mean, which takes one required parameter. You need to specify the optional argument axis for this exercise: If you set axis = 0, [...] in a word vector.
* Use numpy.cov, which takes one required parameter. You need to specify the optional argument rowvar for this exercise. This calculates the [...] feature (variable).

Hey @MattHo,
Thanks a lot for the follow-up. Let me pass your suggestions to the team, and they will update the hints as they deem fit to be best for the learners.

Cheers,
Elemento

Hey @MattHo,
The hints have been modified for easier interpretability. Thanks once again for your feedback.

Cheers,
Elemento