So, what is word embeddings?

arvyzukai · August 21, 2023, 12:48pm

No, they are not.
They (1,000 of them for each of the 40k words in your example) are just floating-point numbers that best fit the training objective (that is, they minimize the loss function).
In other words, the training process adjusts this (40_000 x 1_000) embedding weight matrix (and the other layers' matrices) so that the model fits the data as well as possible (by minimizing the loss function).
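To make that concrete, here is a minimal sketch in plain Python. It is not the course's code: the tiny sizes (6 x 4 instead of 40_000 x 1_000), the toy "model" that just sums a word's embedding values, and the names `predict` and `train_step` are all assumptions made up for illustration. The point is only that the matrix starts as random floats and gets nudged to reduce a loss.

```python
import random

random.seed(0)

VOCAB_SIZE = 6   # hypothetical tiny vocabulary (your example has 40,000)
EMBED_DIM = 4    # hypothetical embedding size (your example has 1,000)

# One row per word, one column per feature; the values start out random.
embeddings = [[random.uniform(-0.1, 0.1) for _ in range(EMBED_DIM)]
              for _ in range(VOCAB_SIZE)]

def predict(word_id):
    # Toy "model" (an assumption): the prediction is the sum of the word's
    # embedding values. A real network would have more layers after this.
    return sum(embeddings[word_id])

def train_step(word_id, target, lr=0.05):
    # Squared-error loss; its gradient w.r.t. each embedding value is 2 * error.
    error = predict(word_id) - target
    for j in range(EMBED_DIM):
        embeddings[word_id][j] -= lr * 2 * error
    return error ** 2  # loss measured before this update

# Repeatedly fit word 2 to a made-up target; only that word's row changes.
losses = [train_step(word_id=2, target=1.0) for _ in range(20)]
```

After the loop, `losses` shrinks from step to step, and row 2 of `embeddings` holds whatever floats happened to reduce the loss. Nothing in those numbers was looked up from the vocabulary.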

A similar example from Course 3 might help: here the embedding dimension is 50, and the values do not come from the vocabulary or anywhere else. They were initially created at random and then updated during training, increased or decreased depending on how well the prediction matched the target.

In your picture, this matrix is shown sideways: the 4 features (Gender, Royal, Age and Food) are the rows and the 6 vocabulary words (Man, Woman, King, Queen, Apple and Orange) are the columns, whereas the features are usually the columns. Also, the 4 feature names in your picture are just for illustration purposes. In reality the features are not that interpretable; they would simply be numbered 0, 1, 2, 3 (and not named after any word from the vocabulary, or any word at all).
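A short sketch of the two layouts, again in plain Python. The numbers are made-up placeholders, not the values from your picture or from the course, and the variable names are hypothetical. It shows the picture's layout (features x words) being flipped into the usual one (words x features), where each feature is just a column index.

```python
words = ["Man", "Woman", "King", "Queen", "Apple", "Orange"]

# Picture layout: one row per feature, one column per word (4 x 6).
# Placeholder values only; the row labels exist just in the picture.
picture = [
    [-1.0, 1.0, -0.9, 0.9, 0.0, 0.0],  # row 0 (labelled "Gender" in the picture)
    [0.0, 0.0, 0.9, 0.9, 0.0, 0.0],    # row 1 (labelled "Royal")
    [0.0, 0.0, 0.7, 0.7, 0.0, 0.0],    # row 2 (labelled "Age")
    [0.0, 0.0, 0.0, 0.0, 0.9, 0.9],    # row 3 (labelled "Food")
]

# Usual layout: one row per word, one column per feature (6 x 4).
embedding_matrix = [list(column) for column in zip(*picture)]

# A word is looked up by its integer id; the result is that word's row.
word_to_id = {word: i for i, word in enumerate(words)}
king_vector = embedding_matrix[word_to_id["King"]]
```

Here `king_vector` has 4 entries, addressed as positions 0 to 3. In a trained model those positions carry no human-readable label.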

Cheers
