I understand that the word embeddings learned by a neural network are usually not humanly interpretable, so let’s just call the first element in a vector “property 1”, the second “property 2”, and so on. Now, how is the order of these “properties” determined? Assume we have learned the embedding matrix E; if we switch two rows of this matrix, the result should be equally “correct” in the sense that the relative position of any two vectors is preserved. In other words, if we run the same learning algorithm on the same training data set, how can we guarantee that the learning result remains the same?
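To make the symmetry concrete, here is a small numpy sketch (the matrix shape and the convention that each word is a row, so the “properties” are columns, are just illustrative assumptions): applying the same permutation of properties to every word vector leaves all pairwise dot products, and therefore all cosine similarities and distances, unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy embedding matrix: 5 words, 4 "properties" per word vector.
E = rng.normal(size=(5, 4))

# Apply the same permutation of "properties" (columns) to every word vector.
perm = [2, 0, 3, 1]
E_perm = E[:, perm]

# All pairwise dot products are unchanged, so norms, cosine similarities,
# and Euclidean distances between word vectors are unchanged too.
print(np.allclose(E @ E.T, E_perm @ E_perm.T))  # True
```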
It all comes down to the random initialization of the weights, which is what breaks the symmetry. If you take the same dataset and run the same training process but with a different random initialization, you may end up with a different solution that is equivalent in the way you describe: which actual property ends up in which position of the embedding vectors may differ, but the network will learn the same properties. This is an example of the “weight space symmetry” argument described in this thread, which applies to all NN architectures.
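A minimal sketch of that weight space symmetry for a generic one-hidden-layer network (all sizes and weights here are made up for illustration): reordering the hidden units, i.e. permuting the rows of the first layer together with the corresponding columns of the second layer, gives a different weight matrix that computes exactly the same function, so the optimizer has no reason to prefer one ordering over another beyond where the random initialization happened to start.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny network: x -> tanh(W1 x + b1) -> W2 h + b2
W1 = rng.normal(size=(4, 3))   # 4 hidden units, 3 inputs
b1 = rng.normal(size=4)
W2 = rng.normal(size=(2, 4))   # 2 outputs
b2 = rng.normal(size=2)

def forward(x, W1, b1, W2, b2):
    h = np.tanh(W1 @ x + b1)
    return W2 @ h + b2

# Permute the hidden units: reorder the rows of W1/b1 and the
# matching columns of W2. The permuted network is a different point
# in weight space but computes exactly the same function.
perm = [2, 0, 3, 1]
W1p, b1p, W2p = W1[perm, :], b1[perm], W2[:, perm]

x = rng.normal(size=3)
print(np.allclose(forward(x, W1, b1, W2, b2),
                  forward(x, W1p, b1p, W2p, b2)))  # True
```

The same reasoning carries over to an embedding layer: permuting the embedding dimensions together with the weights that consume them yields an equally good solution, and which of these equivalent solutions training converges to depends on the random seed.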