Dear colleagues,
Could you please help me with the “PCA as a strategy for dimensionality reduction” section in “Lab: Another explanation about PCA”?
The comments describe this part as follows:
In the next figure, we can see the original data and its corresponding projection over the first and second principal components. In other words, data comprised of a single variable.
but the code plots the original input data rather than its projections:
nPoints = len(data)
# Plot the original data in blue
plt.scatter(data[:,0], data[:,1])
# Plot the projection along the first component in orange
plt.scatter(data[:,0], np.zeros(nPoints))
# Plot the projection along the second component in green
plt.scatter(np.zeros(nPoints), data[:,1])
plt.show()
Shouldn’t we use dataPCA instead of data in the code above to actually plot the projections?
nPoints = len(data)
# Plot the original data in blue
plt.scatter(data[:,0], data[:,1])
# Plot the projection along the first component in orange
plt.scatter(dataPCA[:,0], np.zeros(nPoints))
# Plot the projection along the second component in green
plt.scatter(np.zeros(nPoints), dataPCA[:,1])
plt.show()
Thank you in advance!
With best regards,
Halyna
Hi Halyna,
As I read it, the aim of the section on PCA as a strategy for dimensionality reduction is to show how the number of dimensions can be reduced using dimension axes as principal components. The figure shows how you can go from the blue scattered points in a two-dimensional space to the dots on the orange horizontal line or on the green vertical line, depending on which one-dimensional space you want to reduce to.
To be consistent with the preceding PCA analysis, however, the reduction should have been carried out along the two principal components shown in the chart just above. In other words, the blue dots should have been projected onto the two eigenvectors, showing that the dimensionality could then be reduced to one of those eigenvectors, ideally the one that retains the most information, i.e. [0.70827652 0.7059351].
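For illustration, here is a rough sketch of what that projection could look like. It assumes data is the centered two-dimensional array from the lab; the names pc1 and projected are mine, not from the notebook.

import numpy as np
import matplotlib.pyplot as plt

# Eigendecomposition of the covariance matrix of the (centered) data.
# np.linalg.eigh returns eigenvalues in ascending order, so the last
# column of eigenvectors holds the component retaining the most variance.
eigenvalues, eigenvectors = np.linalg.eigh(np.cov(data.T))
pc1 = eigenvectors[:, -1]

# Project each blue point onto the first principal component:
# data @ pc1 gives the scalar coordinate along pc1, and the outer
# product maps those coordinates back into the original 2D space.
projected = np.outer(data @ pc1, pc1)

plt.scatter(data[:, 0], data[:, 1])            # original data in blue
plt.scatter(projected[:, 0], projected[:, 1])  # projected points on the pc1 line
plt.show()

With the eigenvector above, the projected points would fall on a line at roughly 45 degrees through the blue cloud rather than on the horizontal or vertical axis.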
Dear Reinoud,
Thank you so much for your explanation!
What you’ve just described about dimensionality reduction makes sense.
If I may suggest, perhaps it would be much clearer, and would prevent further questions from students, if your comment “the number of dimensions can be reduced using dimension axes as principal components” were added to the .ipynb instead of just “…its corresponding projection over the first and second principal components”.
Thank you very much for your detailed answer!
I truly appreciate it!
With best regards,
Halyna
Dear Halyna,
That is a great suggestion. I’ll pass it on to the people working on the backend.
Thanks!