So, what are word embeddings?

Hi @someone555777

No, they are not.
They (1,000 of them for each of the 40k words in your example) are just floating-point numbers that best fit the training data (i.e., that minimize the loss function).
In other words, the training process adjusts this (40_000 x 1_000) embedding weight matrix (along with the other layers' matrices) so that it fits the data as well as possible (by minimizing the loss function).
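If it helps to see it concretely, here is a minimal sketch (PyTorch here, and the toy loss is just a stand-in, not the course's actual code) of what that weight matrix is and how a training step touches it:

```python
import torch
import torch.nn as nn

# Sizes picked to match your example: 40k words, 1_000 numbers per word.
vocab_size, embedding_dim = 40_000, 1_000

# The embedding layer is literally a (40_000 x 1_000) matrix of floats,
# randomly initialized - nothing "word-like" about the values themselves.
embedding = nn.Embedding(vocab_size, embedding_dim)
print(embedding.weight.shape)   # torch.Size([40000, 1000])
print(embedding.weight.dtype)   # torch.float32

# During training these floats are treated like any other weights:
# the optimizer nudges them in whatever direction reduces the loss.
optimizer = torch.optim.SGD(embedding.parameters(), lr=0.1)

word_ids = torch.tensor([3, 17, 42])   # some arbitrary word indices
vectors = embedding(word_ids)          # look up their current 1_000-dim vectors
loss = vectors.pow(2).mean()           # stand-in loss, just to drive an update
loss.backward()
optimizer.step()                       # only the looked-up rows get nonzero gradients
```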

A similar example from Course 3 that might help - here the embedding dimension is 50, and the values do not come from the vocabulary or anywhere else - they were initially created at random and then updated accordingly - lowered or increased depending on whether the prediction matched the target.
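A tiny sketch of that idea (the vocab size, learning rate and the `nudge` helper below are made up for illustration, not taken from the course):

```python
import numpy as np

rng = np.random.default_rng(0)

# Start from pure noise: 50 random numbers per word, nothing taken from the vocabulary.
embeddings = rng.normal(size=(10_000, 50))

def nudge(word_id, gradient, lr=0.01):
    # If the prediction was off, the gradient says which of the 50 numbers
    # should go up and which should go down - that is all the "learning" is.
    embeddings[word_id] -= lr * gradient

nudge(word_id=7, gradient=rng.normal(size=50))
```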

In your picture, this matrix is sideways (the 4 features - Gender, Royal, Age and Food - are the rows, and the 6 vocabulary words - Man, Woman, King, Queen, Apple and Orange - are the columns); usually, the features are the columns. Also, the 4 named features in your picture are just for illustration - in reality, the learned features are not that interpretable, so they would simply be indexed 0, 1, 2, 3 (not any word from the vocabulary, or any word at all).
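As a toy illustration (random numbers here, not real trained values), the usual layout would look like this:

```python
import numpy as np

words = ["Man", "Woman", "King", "Queen", "Apple", "Orange"]

# One row per word (vocab size 6), one column per learned feature (4 of them).
rng = np.random.default_rng(1)
E = rng.normal(size=(len(words), 4))

# The columns have no names like "Gender" or "Royal" - they are just
# positions 0, 1, 2, 3, and whatever they end up capturing is learned.
king_vector = E[words.index("King")]   # the 4 floats that represent "King"
print(king_vector)
```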

Cheers