Embedding Layer input and output meaning

import tensorflow as tf

# Build the model
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, 16, input_length=120), 
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(6, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

Why do we pass vocab_size as the first parameter to the Embedding layer?


In the Deep Learning classes, the teacher says the Embedding layer takes an integer matrix of size (batch_size, max_input_length) as input. In my mind, we should therefore pass the batch size to the Embedding layer as its first parameter, so why does the code pass vocab_size? What does it mean?
Thank you very much! I am so confused by the Embedding layer.

Please read this link

Thank you very much! I think I understand it now. The vocab_size corresponds to the largest integer index that can appear in the input matrix (i.e. the number of distinct tokens), not the batch_size.
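To make the shapes concrete, here is a minimal sketch (not from the course notebook; the numbers are arbitrary, for illustration only). The batch size appears only in the data that is fed in, while vocab_size fixes the number of rows in the layer's weight table:

import numpy as np
import tensorflow as tf

vocab_size = 1000        # number of distinct token indices the layer can look up (illustrative)
embedding_dim = 16
batch_size = 2           # determined only by the data you feed in, not by the layer
max_length = 120

embedding = tf.keras.layers.Embedding(vocab_size, embedding_dim)

# Integer matrix of shape (batch_size, max_length); every entry must be < vocab_size
sample_batch = np.random.randint(0, vocab_size, size=(batch_size, max_length))

print(embedding(sample_batch).shape)      # (2, 120, 16) -> (batch_size, max_length, embedding_dim)
print(embedding.get_weights()[0].shape)   # (1000, 16)   -> (vocab_size, embedding_dim)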

import io

# Open the output files that the loop below writes to (example names):
# one for the embedding vectors and one for the word metadata
out_v = io.open('vecs.tsv', 'w', encoding='utf-8')
out_m = io.open('meta.tsv', 'w', encoding='utf-8')

# Get the index-word dictionary
reverse_word_index = tokenizer.index_word

# Get the embedding layer from the model (i.e. first layer)
embedding_layer = model.layers[0]

# Get the weights of the embedding layer
embedding_weights = embedding_layer.get_weights()[0]

# Print the shape. Expected is (vocab_size, embedding_dim)
print(embedding_weights.shape) 

for word_num in range(1, vocab_size):

  # Get the word associated at the current index
  word_name = reverse_word_index[word_num]

  # Get the embedding weights associated with the current index
  word_embedding = embedding_weights[word_num]

  # Write the word name
  out_m.write(word_name + "\n")

  # Write the word embedding
  out_v.write('\t'.join([str(x) for x in word_embedding]) + "\n")

# Close the output files
out_v.close()
out_m.close()

The call embedding_layer.get_weights()[0] returns the layer's weight matrix W. If we want to get the vectors of the words, should we multiply the weights with the input (the integer matrix we feed into the model)? In the code above, do we just treat the value of W as the word embeddings?

Given an input sequence (i.e. a padded sequence of integers representing words), embedding_layer will take care of emitting the corresponding embeddings. Just focus on the architecture.
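To see why no extra multiplication is needed, here is a small illustrative sketch (the values are made up): the vector the layer emits for token index i is exactly row i of the matrix returned by get_weights()[0], so indexing W, as the export code above does, already gives the word embeddings. The lookup is equivalent to multiplying a one-hot vector by W, but no explicit multiplication ever happens.

import numpy as np
import tensorflow as tf

embedding = tf.keras.layers.Embedding(100, 8)   # illustrative vocab_size=100, embedding_dim=8
token_ids = np.array([[3, 7, 7, 42]])           # a "padded sequence" of integer word indices

vectors = embedding(token_ids).numpy()          # shape (1, 4, 8)
W = embedding.get_weights()[0]                  # shape (100, 8)

print(np.allclose(vectors[0, 0], W[3]))         # True: the output for index 3 is row 3 of W
print(np.allclose(vectors[0, 3], W[42]))        # True: the output for index 42 is row 42 of W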

Moved this post to TF1.