C3 W1 Assignment Model intuition

I am trying to understand the logic of the model architecture. The input seems to be a vector of vocab IDs for each word in the tweet (in batches). This vector is passed to the embedding layer, whose output is a matrix (or batch of matrices) of shape Number of tokens x Embedding dims for each member of the batch. This matrix (or batch of matrices) is then passed to the Mean layer, which averages each embedding column over the tokens in a tweet, so the output is 1 x Embedding dims (times batches). That result is then fed into a Dense layer and softmaxed. Are these steps correct?

If yes, is the embedding layer constructed in a way that it expects vectors of IDs and returns the respective embedding vector for each token?

Thanks

Hey @gkouro,
Yes, these steps are correct.

  1. tweet_to_tensor converts a sentence of words into a list of IDs, where each word is mapped to its ID from the vocabulary.
  2. data_generator uses this function for each of the selected positive and negative tweets in a batch drawn from all the examples. It also makes sure that each sentence has a uniform length for its list of IDs, by appending zeros (padding).
  3. Now, if we pass, say, an input of dimensionality (32, 10) to the embedding layer, it means that we have 32 samples, each represented by a list of 10 IDs.
  4. Say the word embeddings are represented by 100 features. Then the embedding layer’s output will have a dimensionality of (32, 10, 100).
  5. This output is fed to a Mean layer, which returns a (32, 100) dimensional output. In other words, it takes the average of the word embeddings of the different words in a sentence.
  6. Then a Dense layer, followed by Softmax (see the sketch below).
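
To make the shapes concrete, here is a minimal sketch of such a pipeline in Trax. The vocabulary size of 500, sequence length of 10, embedding dimension of 100, and 2 output classes are placeholder numbers, not the assignment's actual values, and I use tl.LogSoftmax here; the same shape reasoning applies to a plain softmax:

```python
import numpy as np
import trax
from trax import layers as tl

# Batch of 32 tweets, each padded to 10 token IDs (all zeros here, just for shape).
batch = np.zeros((32, 10), dtype=np.int32)

model = tl.Serial(
    tl.Embedding(vocab_size=500, d_feature=100),  # (32, 10)      -> (32, 10, 100)
    tl.Mean(axis=1),                              # (32, 10, 100) -> (32, 100)
    tl.Dense(n_units=2),                          # (32, 100)     -> (32, 2)
    tl.LogSoftmax(),                              # log-probabilities over 2 classes
)

# Initialize weights from the input signature, then run a forward pass.
model.init(trax.shapes.signature(batch))
print(model(batch).shape)  # (32, 2)
```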

Yes, and this is also mentioned in the output of help(tl.Embedding):

Trainable layer that maps discrete tokens/IDs to vectors.
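
For example, a standalone tl.Embedding layer takes a batch of integer ID lists and returns one embedding vector per token (the sizes below are arbitrary):

```python
import numpy as np
import trax
from trax import layers as tl

emb = tl.Embedding(vocab_size=500, d_feature=100)   # IDs in [0, 500) -> 100-dim vectors

ids = np.array([[1, 42, 7, 0, 0]], dtype=np.int32)  # one tweet, padded to 5 IDs

emb.init(trax.shapes.signature(ids))
print(emb(ids).shape)  # (1, 5, 100): one embedding vector per token ID
```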

I hope this helps.

Cheers,
Elemento
