I don’t really understand what parameters are being learned in word embedding models like word2vec, and how transfer learning works for these models. The goal is to learn the parameters of the word embedding matrix E, but we also have the hidden layer and softmax layer parameters being learned to predict the target word or context, right? So when we use transfer learning, are we just taking the word embedding matrix E and using it for a different application, or are we taking the whole word2vec model to predict context or target for a different group of words?
Word embeddings are learnt from large corpora like Wikipedia. word2vec is an algorithm that learns these embeddings. Transfer learning with learnt word embeddings can be done in the following ways:
- Take the word embeddings and use them directly in your problem. Don’t train this embedding layer; train only your problem-specific layers.
- Fine-tune the embeddings for your particular problem by initializing an embedding layer with the pretrained word embeddings and training on your dataset (see the sketch below).
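Here is a minimal Keras sketch of the two options. The matrix `pretrained_embeddings` and the sizes are hypothetical placeholders, not from any particular provider:

```python
import numpy as np
import tensorflow as tf

# Hypothetical pretrained embedding matrix (in practice, loaded from
# word2vec/GloVe files); random values used here only as a placeholder.
vocab_size, embedding_dim = 10000, 300
pretrained_embeddings = np.random.rand(vocab_size, embedding_dim)

# Option 1: freeze the embeddings and train only the task-specific layers.
frozen_embedding = tf.keras.layers.Embedding(
    input_dim=vocab_size,
    output_dim=embedding_dim,
    embeddings_initializer=tf.keras.initializers.Constant(pretrained_embeddings),
    trainable=False,
)

# Option 2: initialize with the pretrained embeddings and fine-tune them
# together with the rest of the model.
finetuned_embedding = tf.keras.layers.Embedding(
    input_dim=vocab_size,
    output_dim=embedding_dim,
    embeddings_initializer=tf.keras.initializers.Constant(pretrained_embeddings),
    trainable=True,
)
```

Freezing tends to work better with small task datasets; fine-tuning can help when you have enough data for the embeddings to adapt without overfitting.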
Just to clarify: in the word2vec model, the learnt word embeddings are only the parameters of the first layer, not of the entire word2vec model, right?
Sorry. I don’t understand your question. How can the entire model be the first layer of itself?
When using transfer learning with word embeddings, you’ll need two things from the provider:
- An embedding matrix that holds a multidimensional vector for each word.
- A word list where the words are provided in the right order. If the embedding matrix has dimensions `VOCAB x EMBEDDING_DIM`, the words must be ordered by the row indices that correspond to them (a loading sketch follows below).
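For example, here is a sketch of loading such a pair from a hypothetical GloVe-style text file (the file name `embeddings.txt` and its one-word-plus-vector-per-line format are assumptions; adjust to whatever your provider ships):

```python
import numpy as np

EMBEDDING_FILE = "embeddings.txt"  # assumed path and format

words = []    # row order matches the embedding matrix
vectors = []
with open(EMBEDDING_FILE, encoding="utf-8") as f:
    for line in f:
        parts = line.rstrip().split(" ")
        words.append(parts[0])
        vectors.append(np.asarray(parts[1:], dtype="float32"))

embedding_matrix = np.stack(vectors)                  # shape: (VOCAB, EMBEDDING_DIM)
word_to_index = {w: i for i, w in enumerate(words)}   # word -> row index
```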
Here are the steps for transfer learning:
- Include the embedding matrix as the weights of the first layer, an `Embedding` layer. If you don’t want these embeddings to be tuned, set the `trainable` parameter to `False` for this layer.
- For every input sentence, map each word to its index in the word list. That index is used to look up the word’s vector in the embedding matrix; the lookup itself is done by the `Embedding` layer, you just have to provide the right index.
- Add additional layers to your model AFTER the embedding layer.
- Compile & train. (A minimal end-to-end sketch follows below.)
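Putting the steps together, a minimal Keras sketch might look like this. It assumes `embedding_matrix` and `word_to_index` were built as in the loading sketch above; the sentences, labels, and layer sizes are toy placeholders:

```python
import numpy as np
import tensorflow as tf

# Assumes `embedding_matrix` (VOCAB x EMBEDDING_DIM) and `word_to_index`
# exist as built in the loading sketch above.
VOCAB, EMBEDDING_DIM = embedding_matrix.shape
MAX_LEN = 20

def sentence_to_indices(sentence, max_len=MAX_LEN):
    # Step 2: map each word to its row index in the embedding matrix.
    # Unknown words fall back to index 0 here purely as a simplification.
    idx = [word_to_index.get(w, 0) for w in sentence.lower().split()]
    return (idx + [0] * max_len)[:max_len]

sentences = ["this movie was great", "this movie was terrible"]
X = np.array([sentence_to_indices(s) for s in sentences])
y = np.array([1, 0])  # toy binary labels

model = tf.keras.Sequential([
    # Step 1: pretrained matrix as the weights of the first (Embedding) layer,
    # frozen with trainable=False so only the layers below are trained.
    tf.keras.layers.Embedding(
        input_dim=VOCAB,
        output_dim=EMBEDDING_DIM,
        embeddings_initializer=tf.keras.initializers.Constant(embedding_matrix),
        trainable=False,
    ),
    # Step 3: task-specific layers added AFTER the embedding layer.
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

# Step 4: compile & train.
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=2)
```

And to answer the earlier question directly: it is exactly that first `Embedding` layer’s weight matrix that is transferred, not the prediction layers of the original word2vec model.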