I have a quick question regarding the meaning of d_k. IIUC, d_k is the length of the key vector, and it is different from the embedding dimension. In UNQ_C1, I passed all tests with depth = query.shape[-1], but is it supposed to be query.shape[-2], since query.shape[-1] is the embedding dimension?
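To make the question concrete, here is a minimal NumPy sketch of the two candidates. The sizes, the variable names, and the (positions, features) layout are my own assumptions for illustration and are not taken from the assignment code.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
n_queries, n_keys, d_k = 3, 5, 4

rng = np.random.default_rng(0)
query = rng.normal(size=(n_queries, d_k))  # assumed layout: one row per query position
key = rng.normal(size=(n_keys, d_k))       # assumed layout: one row per key position

# The two candidates from the question.
depth_last = query.shape[-1]    # length of each individual query vector
depth_second = query.shape[-2]  # number of query positions

# Scaled dot-product scores: each dot product sums over the last axis,
# so in this layout the last axis is the one used for the scaling factor.
scores = query @ key.T / np.sqrt(depth_last)

print(depth_last, depth_second, scores.shape)  # prints: 4 3 (3, 5)
```

If the assignment uses the same layout, query.shape[-2] would be the number of positions rather than a vector length, but I am not sure that assumption holds there.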
Thanks a lot!