In the second video, around minute 7, Anton Troynikov says that the embedding vector of the 10th text chunk has 358 dimensions. But when I run

len(embedding_function([token_split_texts[10]][0]))

I get 946. Is my understanding of the dimensionality of a text chunk's vector representation wrong? Shouldn't it just be the length of the vector?
I think there is a mistake in your code: the [0] index is inside the call, so [token_split_texts[10]][0] just gives you back the bare string, and the embedding function never receives a list. The [0] needs to go outside the call, to select the first embedding from its output.

Try this:
print(len(embedding_function([token_split_texts[10]])[0]))
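The difference is easy to miss. A minimal sketch with a stand-in embedding function (a hypothetical `fake_embed` that, like Chroma's `SentenceTransformerEmbeddingFunction`, expects a list of documents and returns one 384-dimensional vector per document) shows why the bracket placement matters:

```python
# Hypothetical stand-in for the course's embedding_function: it expects a
# list of documents and returns one 384-dimensional vector per document.
def fake_embed(documents):
    return [[0.0] * 384 for _ in documents]

text = "a chunk of text"  # stands in for token_split_texts[10]

# Wrong: [text][0] is just `text` again, so the function receives a bare
# string and iterates over it character by character -> one vector per char.
wrong = fake_embed([text][0])
print(len(wrong))         # number of characters in the chunk, not 384

# Right: wrap the chunk in a list, then take the first (and only) embedding.
right = fake_embed([text])[0]
print(len(right))         # 384, the embedding dimension
```

With the real embedding function the same bracket placement applies: the list goes around the chunk inside the call, and the `[0]` goes outside it.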
Thank you. I modified my code to this:

import numpy as np

arr = np.array(embedding_function(token_split_texts[10]))
arr.shape
and got (946, 384).
I still don't understand how to interpret this. Are there 946 vectors of dimension 384? It still doesn't make sense that the vector should have 358 dimensions.
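The shape (946, 384) is consistent with the bare-string bug: when the embedding function receives a string instead of a list, it likely iterates over the string's 946 characters and embeds each one as a separate "document", giving 946 vectors of dimension 384. A sketch with a hypothetical `fake_embed` stand-in (same assumed behavior as above: one 384-dim vector per input document) reproduces both shapes:

```python
import numpy as np

# Hypothetical mock of the embedding function: one 384-dimensional vector
# per input document, mirroring the assumed behavior of
# Chroma's SentenceTransformerEmbeddingFunction.
def fake_embed(documents):
    return [[0.0] * 384 for _ in documents]

chunk = "x" * 946  # stands in for a 946-character token_split_texts[10]

# Bare string: each of its 946 characters is embedded separately.
arr = np.array(fake_embed(chunk))
print(arr.shape)   # (946, 384): 946 character-level vectors of dimension 384

# One-element list: a single 384-dim vector for the whole chunk.
arr = np.array(fake_embed([chunk]))
print(arr.shape)   # (1, 384)
```

So 946 is the character count of the chunk, 384 is the model's embedding dimension, and the 358 quoted from the video is presumably a mishearing or typo.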