Why is GloVe not too expensive?

In the GloVe cost function, we have a double summation. In the given example with a dictionary of 10,000 words, that is a hundred million terms. Why is this model not too expensive to optimize?
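
For reference, this is the cost function I mean (writing it from memory of the lecture, so the notation may differ slightly from the slides):

$$
\min \sum_{i=1}^{10{,}000} \sum_{j=1}^{10{,}000} f(X_{ij}) \left( \theta_i^{\top} e_j + b_i + b'_j - \log X_{ij} \right)^2
$$

The outer and inner sums each run over the full 10,000-word vocabulary, which is where the hundred million terms come from.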

Hello Meir,

It is good practice to always think about the performance of our models, so your question makes total sense. I am not an expert, but I will share my thoughts with you. All NLP models need to be trained on and process high volumes of text data; the difference is in how they extract knowledge from the corpus. GloVe transforms the corpus into a multidimensional vector space, where you no longer work with words but with embeddings. Now compare that with models such as BERT or GPT-2, which are complex neural network architectures with billions of parameters in some cases. Optimizing a GloVe model is far less expensive than optimizing a GPT-2 one, particularly in terms of the compute time and hardware resources you will need.
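
To put some rough numbers on that, here is a back-of-the-envelope sketch in Python (the 300-dimensional embedding size is my own assumption, and the GPT-2 figure is the published size of its largest variant):

# Rough parameter counts (my own numbers, not from the course).
# GloVe learns two embedding matrices (theta and e) plus two bias vectors;
# the largest GPT-2 model has roughly 1.5 billion parameters.
vocab_size = 10_000      # dictionary size from the lecture example
embedding_dim = 300      # assumed GloVe embedding dimension

glove_params = 2 * vocab_size * embedding_dim + 2 * vocab_size
gpt2_params = 1_500_000_000

print(f"GloVe parameters: ~{glove_params:,}")                    # ~6,020,000
print(f"GPT-2 parameters: ~{gpt2_params:,}")                     # 1,500,000,000
print(f"GPT-2 / GloVe:    ~{gpt2_params / glove_params:.0f}x")   # ~249x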

I recommend reading this interesting article, which explains many details of the model.

Hope this gave you more context.

Best,

Rosa

Hi Rosa,
Thank you for your input.
I am still not sure I understand why this is different from word2vec, where we were concerned about the expensive summation in SoftMax and introduced Negative Sampling just for that reason…
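Just to make sure we mean the same thing, the softmax I am referring to is (again from memory of the lecture, for a target word t and context word c):

$$
p(t \mid c) = \frac{e^{\theta_t^{\top} e_c}}{\sum_{j=1}^{10{,}000} e^{\theta_j^{\top} e_c}}
$$

whose denominator is a sum over all 10,000 vocabulary words for every single training example.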
Meir

Hi Meir,

Ah, then your question was really “why is GloVe less expensive than Word2Vec?”. I think the answer is here. You were right that in terms of embedding quality they are similar, but the key is the implementation.
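
One implementation detail that I find helpful here (it comes from the original GloVe paper rather than the course, so treat it as a pointer): the weighting term f(X_ij) is defined to be zero whenever X_ij = 0, and the co-occurrence matrix is very sparse, so the double sum effectively only runs over word pairs that actually co-occur, and it is computed over the precounted matrix rather than once per training example. A tiny sketch of that idea in Python (the 1% density is just an assumption for illustration):

from scipy.sparse import random as sparse_random

vocab_size = 10_000
# Simulated sparse co-occurrence matrix; real corpora give similarly sparse counts.
X = sparse_random(vocab_size, vocab_size, density=0.01, format="coo", random_state=0)

full_double_sum_terms = vocab_size * vocab_size   # the "hundred million" terms
contributing_terms = X.nnz                        # terms where f(X_ij) != 0

print(f"terms in the full double sum:   {full_double_sum_terms:,}")   # 100,000,000
print(f"terms that actually contribute: {contributing_terms:,}")      # ~1,000,000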

Hope this helps 🙂

Rosa