C5 W4 Attention (Q,K,V)

d_k is the dimension of the keys.
Does d_k mean the shape of the keys (K)?

d_k refers to the dimension of each key vector inside the multi-head attention block, i.e. the size of the last axis of K for a single head, not the full shape of K.
If the embedding dimension of the model is 512 and there are 8 attention heads, then $d_k = \frac{512}{8} = 64$.
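
To make the arithmetic concrete, here is a minimal sketch in plain Python. The helper name `head_dim` and the sequence length of 10 are made up for illustration and do not come from the course code. It assumes the embedding dimension is split evenly across the heads, as in the example above.

```python
import math


def head_dim(d_model: int, num_heads: int) -> int:
    """Per-head key dimension d_k, assuming an even split across heads.

    `head_dim` is a hypothetical helper name used only for this illustration.
    """
    if d_model % num_heads != 0:
        raise ValueError("d_model must be divisible by num_heads")
    return d_model // num_heads


d_model = 512    # embedding dimension from the example above
num_heads = 8    # number of attention heads from the example above
seq_len = 10     # assumed sequence length, only to show a shape

d_k = head_dim(d_model, num_heads)

# d_k is a single number; the keys of one head form a matrix
# with one row per token and d_k columns.
keys_shape_per_head = (seq_len, d_k)

# Scaled dot-product attention divides the Q.K^T scores by sqrt(d_k).
scale = math.sqrt(d_k)

print(d_k, keys_shape_per_head, scale)
```

So d_k is one number (64 here), while the shape of the keys for one head is (seq_len, d_k).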