W2 - Assignment 2: In the second part, why do we get the indices of the words instead of accessing the GloVe vectors directly from word_to_vec_map?

In sentences_to_indices, we use word_to_index to get the index of each word, and then we run the indices through the Embedding layer to get the word vectors.
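To illustrate what I mean, here is a toy sketch of that first step with a made-up three-word vocabulary (the real one has about 400,001 GloVe words; this is not the exact notebook code):

```python
import numpy as np

# Toy vocabulary: word -> integer index
word_to_index = {"i": 1, "love": 2, "you": 3}

# sentences_to_indices-style conversion: words -> padded index array
sentence = "i love you"
max_len = 5
indices = np.zeros(max_len)
for j, w in enumerate(sentence.lower().split()):
    indices[j] = word_to_index[w]

print(indices)  # [1. 2. 3. 0. 0.] -> these indices are what we feed to the Embedding layer
```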

Why can’t we use word_to_vec_map directly? It has all the words and their respective vectors.

According to the argument descriptions in Exercise 4:

  word_to_vec_map -- dictionary mapping words to their GloVe vector representation.
  word_to_index -- dictionary mapping from words to their indices in the vocabulary (400,001 words)

I still don’t understand. These argument descriptions are what confuse me. If we’re getting word vectors from the embedding layer, why can’t we just skip that layer and get the vectors directly from word_to_vec_map?

So, in part 2, we create an embedding matrix from the GloVe vectors. But we also take the indices of each word using word_to_index, just to access the GloVe vectors. Why can’t we just get the GloVe vectors from the words in word_to_vec_map directly?
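Here is roughly what I mean, with a toy vocabulary and 2-dimensional vectors instead of the real 50-dimensional GloVe ones (a simplified sketch, not the exact notebook code):

```python
import numpy as np
from tensorflow.keras.layers import Embedding

word_to_index = {"i": 1, "love": 2, "you": 3}
word_to_vec_map = {
    "i":    np.array([0.1, 0.2]),
    "love": np.array([0.3, 0.4]),
    "you":  np.array([0.5, 0.6]),
}

vocab_size = len(word_to_index) + 1          # +1 so index 0 is left for padding
emb_dim = 2
emb_matrix = np.zeros((vocab_size, emb_dim))
for word, idx in word_to_index.items():
    emb_matrix[idx, :] = word_to_vec_map[word]   # copy each GloVe vector into row idx

embedding_layer = Embedding(vocab_size, emb_dim, trainable=False)
embedding_layer.build((None,))                   # create the layer's weights
embedding_layer.set_weights([emb_matrix])        # load the GloVe-based matrix

print(embedding_layer(np.array([[1, 2, 3]])))    # rows 1-3 of emb_matrix, i.e. the same vectors
```

So the Embedding layer just looks the same vectors back up by index, which is why going through indices feels redundant to me.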

I see your point. Sorry that I misunderstood your question earlier.
I think the embedding layer is still needed for training purposes? We just didn’t train the embedding values further in our assignment because the dataset is not large enough.
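In other words, I think the point of going through an Embedding layer is that it has a trainable weight matrix, which word_to_vec_map does not. A rough sketch of the difference (the sizes are approximate and the variable names here are just illustrative):

```python
from tensorflow.keras.layers import Embedding

vocab_size, emb_dim = 400001, 50   # roughly the GloVe-50d sizes used in the exercise

# What the assignment does: keep the GloVe-initialized weights frozen,
# because our training set is too small to fine-tune them.
frozen = Embedding(vocab_size, emb_dim, trainable=False)

# With a larger dataset, the same layer could keep learning, and its rows
# would drift away from the original GloVe vectors during training.
tunable = Embedding(vocab_size, emb_dim, trainable=True)
```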


That was my interpretation too after reading the instructions again. I’m assuming we need the indices in general because the embedding matrix may end up with different values from the GloVe vectors. It seems like initializing the embedding matrix from the GloVe vectors was mostly a convenience, and it won’t always be the case that a row of the matrix matches the corresponding GloVe vector.