Hi,
I have a question about one of the NLP Week 2 practice quiz questions.
Here is the question:
“When it comes to translating French to English using dot product attention:”
Here are the options:
- The intuition is that each query qi, picks most similar key kj. This allows the attention model to focus on the right words at each time step.
- The queries are the French words, and the keys and values are the English words.
- A CPU is more than enough to train this type of model.
- You find the distribution by multiplying the queries by the keys (you might need to scale), take the Softmax and then multiply it by the values.
I think the correct options are 1 and 4 only, but the feedback says I have not selected all the correct options (score: 0.75/1).
I think the second option is not correct: since we are translating French to English, the queries would be the English words, and the keys and values would be the French words.
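To make my reasoning concrete, here is a minimal sketch of how I understand dot-product attention, written in plain Python. The function names and the toy vectors are made up by me for illustration (they are not from the course), and the sketch assumes the queries come from the English (decoder) side while the keys and values come from the French (encoder) side:

```python
import math

def softmax(scores):
    # Subtract the max for numerical stability before exponentiating.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot_product_attention(queries, keys, values, scale=True):
    """Dot-product attention over plain lists of vectors (illustrative helper)."""
    d = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query with every key (dot product).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        if scale:
            # Optional scaling by sqrt of the key dimension.
            scores = [s / math.sqrt(d) for s in scores]
        # Softmax turns the scores into a distribution over the keys.
        w = softmax(scores)
        # The output is the weighted sum of the values.
        out = [sum(wj * v[i] for wj, v in zip(w, values))
               for i in range(len(values[0]))]
        weights.append(w)
        outputs.append(out)
    return outputs, weights

# Hypothetical toy vectors, assumed only for this example.
french_encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # keys and values
english_decoder_states = [[2.0, 0.0], [0.0, 2.0]]             # queries

outputs, weights = dot_product_attention(
    english_decoder_states, french_encoder_states, french_encoder_states
)
for w in weights:
    print([round(x, 3) for x in w])
```

Each row of `weights` is the distribution for one English query over the three French keys, which is what option 4 describes as I read it.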
Can someone correct me if I am wrong?