With skip-gram, the first weight matrix, between the input and hidden layer, is what we keep as the word embeddings. But CBOW is the opposite, with context words predicting the target word. So which weights in the network are we keeping for the word embeddings?
Hey @Max_Rivera,
The word embeddings come from the same weight matrix in both cases: the one between the input and the hidden layer. The only difference is how the inputs and outputs are arranged. Skip-gram looks up the embedding of the target word and predicts the context words, while CBOW averages the embeddings of the context words and predicts the target word; either way, the rows of that first weight matrix are the vectors you keep. For more details, you can refer to this detailed blog. Let me know if this helps.
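Here is a minimal NumPy sketch of the idea; the names `W_in`/`W_out`, the toy dimensions, and the plain-softmax output are my own simplifications for illustration, not any library's actual implementation:

```python
import numpy as np

# Hypothetical toy dimensions, just for illustration.
vocab_size, embed_dim = 10, 4
rng = np.random.default_rng(0)

W_in = rng.normal(size=(vocab_size, embed_dim))   # input -> hidden weights
W_out = rng.normal(size=(embed_dim, vocab_size))  # hidden -> output weights

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Skip-gram: the hidden layer is the embedding of the single target word.
def skipgram_forward(target_idx):
    h = W_in[target_idx]                  # look up one row of W_in
    return softmax(h @ W_out)             # predict each context word

# CBOW: the hidden layer is the average of the context words' embeddings,
# but those embeddings are still just rows of the same W_in.
def cbow_forward(context_idxs):
    h = W_in[context_idxs].mean(axis=0)   # average rows of W_in
    return softmax(h @ W_out)             # predict the target word

# Either way, the learned word vectors you keep are the rows of W_in:
embedding_for_word_3 = W_in[3]
```

You can see the same thing in practice with gensim, where both `Word2Vec(sg=1)` (skip-gram) and `Word2Vec(sg=0)` (CBOW) expose the learned input-side vectors through the same `model.wv` attribute.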
Cheers,
Elemento