Does embedding projector use dimensional reduction?

tbhaxor · January 27, 2023, 2:33am

In the NLP course I learnt that an embedding layer is used to represent a word in n-dimensional space, where n could be any natural number. But we can only visualize up to 3D, so https://projector.tensorflow.org/ projects the n-dimensional data down to 3D, right? If yes, does this mean some information would be lost?

rmwkwok · January 27, 2023, 2:37am

Hello @tbhaxor,

My answer is yes and yes.

From here we can see that it first uses PCA to find those principal “components”, where each component carries some percentage of the total variance. Then we can select up to 3 components to represent a data point. So, we are going to lose the information that the other components carry.
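A minimal sketch of that idea, using NumPy only. The random `embeddings` matrix is a made-up stand-in for real word vectors, and this is not the Projector's actual implementation, just plain PCA via SVD:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in for an embedding matrix: 200 "words", 50 dimensions,
# with the columns scaled so that some directions carry more variance.
embeddings = rng.normal(size=(200, 50)) * np.linspace(3.0, 0.1, 50)

# PCA via SVD of the mean-centered data.
centered = embeddings - embeddings.mean(axis=0)
_, singular_values, components = np.linalg.svd(centered, full_matrices=False)

# Share of the total variance carried by each principal component.
explained_ratio = singular_values**2 / np.sum(singular_values**2)

# Keep only the first 3 components, as a 3D view would.
projected_3d = centered @ components[:3].T
kept = explained_ratio[:3].sum()
print(f"variance kept by 3 components: {kept:.1%}, lost: {1 - kept:.1%}")
```

Whatever share is reported as "lost" is the variance carried by the 47 components that were dropped.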

Raymond

Samuel_Chazy · January 27, 2023, 5:39am

Hi @rmwkwok @tbhaxor ,
You are absolutely right. If we use PCA, t-SNE, etc., there will be a loss of information, because these methods keep only a low-dimensional summary of the data (PCA, for example, keeps the directions of maximum variance). It is a trade-off between having hundreds of variables that we cannot visualize and having only a few axes/principal components to deal with. So trading complexity for simplicity, with a bit of loss, is worth it.

Christian_Simonis · January 27, 2023, 5:21pm

In addition to the very good previous answers, I want to add a few things for completeness:

Often yes (probably 99% of all examples I have seen in data science), but only if the target space has fewer dimensions than the original space. So the statement is not true without exception…

Let’s take PCA. It is closely related to a singular value decomposition: one application is a modal transformation in structural dynamics, which is done to decouple interactions in the system so that eigenfrequencies, eigenmodes, etc. are easier to analyse.

(Often, of course, the benefit of model order reduction is used in this context too: done well, it achieves almost the same accuracy with much better computational performance, which is often the way to go.) But I want to highlight that singular value decomposition (SVD) or PCA could also, in theory, be performed in the full original space without loss of information, because in the end it is only an invertible linear transformation (a change of basis).
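A small sketch of this point, assuming made-up random data and plain NumPy: keeping all components lets us reconstruct the data up to floating-point error, while keeping only 3 does not.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(100, 8))  # hypothetical data: 100 samples, 8 features

mean = data.mean(axis=0)
centered = data - mean
# Rows of `components` are the principal directions (an orthonormal basis).
_, _, components = np.linalg.svd(centered, full_matrices=False)

def reconstruct(k):
    """Project onto the first k components, then map back to the original space."""
    scores = centered @ components[:k].T
    return scores @ components[:k] + mean

full_error = np.max(np.abs(reconstruct(8) - data))     # all 8 components kept
reduced_error = np.max(np.abs(reconstruct(3) - data))  # only 3 components kept
print(f"max error with 8 components: {full_error:.2e}")
print(f"max error with 3 components: {reduced_error:.2e}")
```

The information loss comes from discarding components, not from the transformation itself.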

If you are interested in structural dynamics, feel free to take a look.

Hope that helps!

Best regards
Christian

Samuel_Chazy · January 30, 2023, 9:58am

Hi @Christian_Simonis,
In your opinion and experience, what is the best dimensionality reduction algorithm to use: PCA, LDA, SVD, t-SNE, UMAP, etc.?

Christian_Simonis · January 30, 2023, 11:19am

Hi @Samuel_Chazy

I believe this really depends on what you want to achieve. Do you have a concrete example (e.g. a visualization, anomaly detection, etc.) of what you want to do after reducing the feature space?

Since you asked for my personal experience:

Personally, I think PCA is great because it helps with understanding the data, and its transformations are interpretable because they are linear. Things get more difficult when you have to deal with non-linearity. Kernel PCA might help here, but if you have a really large amount of data you could also consider a deep-learning-based approach to learn your embeddings, e.g. with autoencoders, which can be very effective!

Note: if only a small number of labels is available, Siamese networks can sometimes be very powerful for learning embeddings, depending on what you want to achieve and whether you meet the data requirements.

I also played around with some of the other methods you listed, like t-SNE, but only for visualization purposes.

Best regards
Christian

Samuel_Chazy · January 30, 2023, 11:43am

Hi @Christian_Simonis,

I agree with what you said there. I mostly use PCA and kernel PCA when I deal with data that has fewer than 1,000 columns. I haven’t worked on bigger datasets. The datasets I work on are usually numerical, so I am not sure whether autoencoders would be a good option. Thanks!
