C3W1 - Confused with Embedding and Mean Layers

I am confused by the NLP C3W1 introduction. I don’t understand how the embedding for each word can have multiple dimensions, or why we need to take a mean over them (per the discussion at https://community.deeplearning.ai/t/trax-mean-layer/230473, the embedding size is 2, with each column representing a different embedding feature). From C2, an embedding is a column or row of a weight matrix, either the first or the second one, assuming we use two layers.

Does the instruction imply that we use a two-layer NN and take the mean of these two layers?

I don’t understand why the embedding size is 2.

The “embedding size = 2” does not mean the model has two layers. It simply means each word is represented by a 2-dimensional embedding vector, chosen so the course can easily visualize the embedding space.

Every word becomes a point in a 2D space, and the Mean layer computes the average of these vectors across all words in a sentence, producing one fixed-size representation. This averaged vector is then passed to a simple dense layer for classification. The key idea is that we are averaging features, not averaging layers — the embedding dimension is just the size of the word vectors.
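To make this concrete, here is a minimal NumPy sketch of the same idea (the course uses Trax, but the vocabulary size, word IDs, and random embedding values below are made up for illustration): each word ID looks up one 2-dimensional row of the embedding matrix, and the mean is taken across the words of the sentence, not across layers.

```python
import numpy as np

# Toy setup (hypothetical numbers): 5-word vocabulary, embedding size 2
vocab_size, embed_dim = 5, 2
rng = np.random.default_rng(0)
embedding_matrix = rng.normal(size=(vocab_size, embed_dim))  # one 2-D vector per word

sentence = np.array([1, 3, 4])        # word IDs of a 3-word sentence
vectors = embedding_matrix[sentence]  # shape (3, 2): one 2-D row per word

# The "Mean layer": average over the word axis -> one fixed-size vector
sentence_vector = vectors.mean(axis=0)  # shape (2,)

print(vectors.shape)          # (3, 2)
print(sentence_vector.shape)  # (2,)
```

However long the sentence is, the result is always a single vector of length 2, which is why it can be fed to a dense classification layer.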


got it, thanks


The model already knows the right answer. The problem is stability under compression. Compression-Aware Intelligence (CAI) measures that.

Language models don’t contain “right answers”. They contain correlations between sequences of words.
