Clustering methods for embedding vectors

nhuynh2 · February 12, 2026, 5:04pm

Hi! I’m working on a project and am very new to Natural Language Processing techniques. I have some high-dimensional embeddings generated from resumes and hope to group them into clusters. I found suggestions to use UMAP and t-SNE, but their own websites carry some warnings. I’m looking for a reliable method that works with Python. Thanks!

paulinpaloalto · February 12, 2026, 5:24pm

Well, first you need to define “reliable”. I phrased that as a joke, but it’s a serious question: how do you define success or “good enough”? What is your actual goal?

The topic of vector embeddings (both how to create them and how to use them) is discussed in some detail at several points in the courses offered here. For example, it is one of the major topics in DLS Course 5 Sequence Models. I’m sure it is also covered in the NLP specialization.

There are two standard metrics for evaluating the similarity between different embedding vectors: cosine similarity and Euclidean distance. I do not have personal experience with this beyond the material in the courses here. My suggestion would be to start by taking DLS C5 and learn what is taught there. That will at least give you a framework for interpreting the warnings on the two websites that you mention above.
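
To make the two metrics concrete, here is a minimal NumPy sketch. The three toy vectors are made up for illustration and are not real resume embeddings; the function names are also just my own choices.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine of the angle between a and b: 1 = same direction, 0 = orthogonal."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def euclidean_distance(a, b):
    """Straight-line (L2) distance between a and b."""
    return float(np.linalg.norm(a - b))


# Toy vectors (hypothetical): b points the same way as a but is twice as long,
# and c is orthogonal to a.
a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 4.0, 6.0])
c = np.array([-3.0, 0.0, 1.0])

print(cosine_similarity(a, b), euclidean_distance(a, b))
print(cosine_similarity(a, c), euclidean_distance(a, c))
```

Note that `a` and `b` have a cosine similarity of 1 yet a nonzero Euclidean distance: cosine similarity ignores vector length, while Euclidean distance does not. Which behavior you want depends on whether the magnitude of your embeddings carries meaning.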

nhuynh2 · February 13, 2026, 7:13pm

Thank you very much for all the suggestions.

balaji.ambresh · February 14, 2026, 9:16am

Please check the TensorFlow Embedding Projector as well. It shows how to project embeddings into a lower-dimensional space, either for visualization or as a preprocessing step for clustering.
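
As a rough sketch of the "reduce first, then cluster" idea in Python, something like the following could work with scikit-learn. Everything specific here is an assumption for illustration: the data is synthetic (three made-up groups of 384-dimensional vectors, not resume embeddings), PCA stands in for UMAP or t-SNE because it is simple and deterministic, and the choices of 10 components and 3 clusters are arbitrary.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import normalize

# Synthetic stand-in data (assumption): 3 groups of 50 vectors in 384 dimensions.
rng = np.random.default_rng(0)
centers = rng.normal(size=(3, 384))
embeddings = np.vstack(
    [center + 0.1 * rng.normal(size=(50, 384)) for center in centers]
)

# L2-normalize each row so that Euclidean distance between rows
# orders pairs the same way cosine similarity does.
unit = normalize(embeddings)

# Project to a lower-dimensional space before clustering.
reduced = PCA(n_components=10, random_state=0).fit_transform(unit)

# Cluster the reduced vectors; n_clusters=3 matches the synthetic data only.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(reduced)

print(reduced.shape)
print(np.bincount(labels))
```

On real embeddings the number of clusters is usually unknown, so it is worth trying several values and comparing them with a measure such as the silhouette score before trusting any one grouping.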
