The fact that a single linear transformation matrix can map words from one language to another is quite surprising, and worth emphasizing.
This suggests some structure common to all languages; otherwise, translating unseen words this way would not work.
Does anyone know more about this?
Has anyone experimented with different word embeddings?
Does the same matrix R work for different training instances of the same pair of languages?
Does the same matrix R work for, say, translating English to French using either word2vec or GloVe pretrained vectors?
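In case it helps anyone try these questions out, here is a minimal sketch of fitting R by least squares on a dictionary of paired vectors (this is the standard formulation, e.g. in Mikolov et al.'s translation-matrix paper). The shapes and the random placeholder data are just assumptions for illustration; real experiments would load actual word2vec or GloVe vectors:

```python
import numpy as np

# Hypothetical placeholder data: n dictionary pairs of d-dimensional vectors.
# X = source-language vectors (e.g. English), Y = their translations (e.g. French).
rng = np.random.default_rng(0)
n, d = 5000, 300
X = rng.normal(size=(n, d))
Y = rng.normal(size=(n, d))

# Solve min_R ||X R - Y||_F^2 via least squares (closed form).
R, *_ = np.linalg.lstsq(X, Y, rcond=None)

# Translate an unseen source vector: map it through R, then return the
# index of its nearest target-language neighbor by cosine similarity.
def translate(x, target_vecs):
    mapped = x @ R
    sims = target_vecs @ mapped / (
        np.linalg.norm(target_vecs, axis=1) * np.linalg.norm(mapped) + 1e-9
    )
    return int(np.argmax(sims))
```

Fitting R on two different training runs (or on word2vec vs. GloVe vectors) and comparing translation accuracy would be one way to answer the questions above empirically; note that the two embedding spaces are only defined up to rotation, so one would compare accuracies rather than the matrices themselves.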