W2 - Assignment 2: In the second part, why do we get the indices of the words instead of accessing the GloVe vectors directly from word_to_vec_map?

In sentences_to_indices, we use word_to_index to get the index of each word, and then we run the indices through the Embedding layer to get the word vectors.
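To illustrate what I mean, here is a toy sketch of that first step with a made-up three-word vocabulary (the real one has about 400,001 GloVe words; this is not the exact notebook code):

```python
import numpy as np

# Toy vocabulary: word -> integer index
word_to_index = {"i": 1, "love": 2, "you": 3}

# sentences_to_indices-style conversion: words -> padded index array
sentence = "i love you"
max_len = 5
indices = np.zeros(max_len)
for j, w in enumerate(sentence.lower().split()):
    indices[j] = word_to_index[w]

print(indices)  # [1. 2. 3. 0. 0.] -> these indices are what we feed to the Embedding layer
```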

Why can’t we use word_to_vec_map directly? It has all the words and their respective vectors.

According to the argument descriptions in Exercise 4:

  word_to_vec_map -- dictionary mapping words to their GloVe vector representation.
  word_to_index -- dictionary mapping from words to their indices in the vocabulary (400,001 words)

I still don’t understand. These argument descriptions are what confuse me. If we’re getting word vectors from the embedding layer, why can’t we just skip that layer and get the vectors directly from word_to_vec_map?

So, in part 2, we create an embedding matrix from the GloVe vectors. But we also take the indices of each word using word_to_index, just to access the GloVe vectors. Why can’t we just get the GloVe vectors from the words in word_to_vec_map directly?
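Here is roughly what I mean, with a toy vocabulary and 2-dimensional vectors instead of the real 50-dimensional GloVe ones (a simplified sketch, not the exact notebook code):

```python
import numpy as np
from tensorflow.keras.layers import Embedding

word_to_index = {"i": 1, "love": 2, "you": 3}
word_to_vec_map = {
    "i":    np.array([0.1, 0.2]),
    "love": np.array([0.3, 0.4]),
    "you":  np.array([0.5, 0.6]),
}

vocab_size = len(word_to_index) + 1          # +1 so index 0 is left for padding
emb_dim = 2
emb_matrix = np.zeros((vocab_size, emb_dim))
for word, idx in word_to_index.items():
    emb_matrix[idx, :] = word_to_vec_map[word]   # copy each GloVe vector into row idx

embedding_layer = Embedding(vocab_size, emb_dim, trainable=False)
embedding_layer.build((None,))                   # create the layer's weights
embedding_layer.set_weights([emb_matrix])        # load the GloVe-based matrix

print(embedding_layer(np.array([[1, 2, 3]])))    # rows 1-3 of emb_matrix, i.e. the same vectors
```

So the Embedding layer just looks the same vectors back up by index, which is why going through indices feels redundant to me.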

I see your point. Sorry that I misunderstood your question earlier.
I think the embedding layer is still needed for training purposes? We just didn’t train the embedding values further in our assignment because the dataset is not large enough.
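In other words, I think the point of going through an Embedding layer is that it has a trainable weight matrix, which word_to_vec_map does not. A rough sketch of the difference (the sizes are approximate and the variable names here are just illustrative):

```python
from tensorflow.keras.layers import Embedding

vocab_size, emb_dim = 400001, 50   # roughly the GloVe-50d sizes used in the exercise

# What the assignment does: keep the GloVe-initialized weights frozen,
# because our training set is too small to fine-tune them.
frozen = Embedding(vocab_size, emb_dim, trainable=False)

# With a larger dataset, the same layer could keep learning, and its rows
# would drift away from the original GloVe vectors during training.
tunable = Embedding(vocab_size, emb_dim, trainable=True)
```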


That was my interpretation too after reading the instructions again. I’m assuming we need the indices in general because the embedding matrix may end up with different values from the GloVe vectors. It seems like initializing the embedding matrix from the GloVe vectors was mostly a convenience, and it won’t always be the case that a row of the matrix matches the corresponding GloVe vector.