I am confused by the NLP C3W1 introduction: I don’t understand how the embedding for each word can have multiple dimensions and why we need to take a mean over them. (Per the discussion at https://community.deeplearning.ai/t/trax-mean-layer/230473, the embedding size is 2, i.e. each column represents a different embedding feature.) From C2, an embedding is a column or row of the weight matrix, either the first or the second one, assuming we use two layers.
In the instructions, does this imply we use a two-layer NN and take the mean of those two layers?
The “embedding size = 2” does not mean the model has two layers. It simply means each word is represented by a 2-dimensional embedding vector, chosen so the course can easily visualize the embedding space.
Every word becomes a point in a 2D space, and the Mean layer computes the average of these vectors across all words in a sentence, producing one fixed-size representation. This averaged vector is then passed to a simple dense layer for classification. The key idea is that we are averaging features, not averaging layers — the embedding dimension is just the size of the word vectors.
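A minimal NumPy sketch of what the Mean layer does (the sentence length, the 2-dimensional embeddings, and the numbers here are made up for illustration; this is not the actual Trax code from the assignment):

```python
import numpy as np

# Hypothetical embeddings for a 3-word sentence with embedding size = 2.
# Each row is one word's 2-dimensional embedding vector.
sentence_embeddings = np.array([
    [0.5, -1.0],   # word 1
    [1.5,  0.0],   # word 2
    [-0.5, 2.0],   # word 3
])

# The Mean layer averages over the words (axis 0 here), not over layers,
# producing a single fixed-size 2-dimensional vector for the whole sentence.
sentence_vector = sentence_embeddings.mean(axis=0)
print(sentence_vector)  # [0.5, 0.333...]
```

Notice that the output has the same size as one word embedding (2), regardless of how many words the sentence contains; that fixed-size vector is what gets fed to the dense classification layer.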