In the Detecting Sarcasm assignment, a single training example has length 120 (after padding). What are the dimensions of the embedding layer? Is it (120, 16)? If so, why does the vocab size need to be set in the layer as well?
tf.keras.layers.Embedding(vocab_size, embedding_dim, input_length=max_length)
If the vocab size doesn’t influence the dimensions of this particular embedding, why does it need to be there?
Also, what will the dimensions of the output from this layer be?
Many thanks
Hi @pablowilks,
We use an embedding to reduce the dimensionality of the space. Initially each word is just an index into a large vocabulary, and the embedding maps every index to a dense vector in a much smaller space. The layer is essentially a lookup table with one row per word in the vocabulary, which is why it needs to know the vocab size.
For our particular example, if we have a vocabulary of 1,000 words used in a sarcasm collection, we could learn 16-dimensional embeddings for each word by training the network to predict whether a review is sarcastic. Words that tend to appear in sarcastic reviews, such as “brilliant” or “excellent”, will end up closer together in the embedding space because the network has learned they are both associated with that kind of review.
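As a quick check, here is a minimal sketch (assuming a TF 2.x / Keras 2 setup like the course uses; newer Keras versions may not accept input_length, and the word indices here are made up) that passes a dummy padded batch through the layer and inspects both the output shape and the layer's weight matrix:

```python
import numpy as np
import tensorflow as tf

vocab_size = 1000      # number of distinct word indices the layer must handle
embedding_dim = 16     # size of the learned vector per word
max_length = 120       # padded sequence length

# The layer holds a (vocab_size, embedding_dim) lookup table as its weights.
embedding = tf.keras.layers.Embedding(vocab_size, embedding_dim, input_length=max_length)

# A dummy batch of 2 padded examples, each a sequence of 120 word indices.
dummy_batch = np.random.randint(0, vocab_size, size=(2, max_length))

output = embedding(dummy_batch)
print(output.shape)               # (2, 120, 16) -> one 16-d vector per word position
print(embedding.weights[0].shape) # (1000, 16)   -> the lookup table itself
```

So the output for a padded example only depends on the sequence length and the embedding dim, while the vocab size fixes how many rows the lookup table needs.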
I guess the best way to see why you need to provide the vocabulary size is to look at model.summary(). For a model with a vocabulary size of 1000 and an embedding dim of 16:
# Layer (type)                  Output Shape          Param #
# =================================================================
# embedding (Embedding)         (None, 120, 16)       16000
# _________________________________________________________________
# global_average_pooling1d (Gl  (None, 16)            0
# _________________________________________________________________
# dense (Dense)                 (None, 24)             408
# _________________________________________________________________
# dense_1 (Dense)               (None, 6)              150
# =================================================================
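For reference, this is roughly the model that produces a summary like the one above (a sketch only; the activations are my assumption, since the summary shows just layer types and shapes):

```python
import tensorflow as tf

vocab_size = 1000
embedding_dim = 16
max_length = 120

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embedding_dim, input_length=max_length),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(24, activation='relu'),    # assumed activation
    tf.keras.layers.Dense(6, activation='softmax'),  # assumed activation
])

model.summary()
```

Note the 16,000 parameters in the embedding layer: that is vocab_size * embedding_dim = 1000 * 16, which is exactly where the vocab size matters, while the output shape (None, 120, 16) only reflects the batch size, the padded sequence length, and the embedding dim.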
Hope it’s not too confusing 