Course 5 Week 4 scaled_dot_product_attention()

I was able to pass all tests but I have a doubt. The instructions state that dk is the dimension of the keys. I initially used dk = np.size(k), but this didn’t work. So I tried dk = np.size(k[1]), and then all tests passed. Basically I’m unsure why dk = the size of the second dimension of k, seq_len_k, instead of the entire k.

Hi @TheLonelyChemE

dk should be set to np.size(k)[0]. Dimensions [0] and [1] just happen to have the same value in this unit test, which is why your code was able to pass all the tests.
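The expressions mentioned so far measure different things, and a square test matrix hides the difference. Note that np.size returns a single integer, so it cannot be indexed; np.shape is the call that returns one entry per axis. A small sketch with a non-square array (the shape (5, 3) is an arbitrary choice for illustration, not the assignment's test data) shows what each expression returns:

```python
import numpy as np

# Hypothetical key matrix: seq_len_k = 5 rows, depth = 3 columns.
# These sizes are made up for illustration; they are not the unit test's values.
k = np.zeros((5, 3))

total_elements = np.size(k)    # total number of elements in k (5 * 3)
shape = np.shape(k)            # tuple with one entry per axis
row_length = np.size(k[1])     # k[1] is the row at index 1, so this is its length
seq_len_k = np.shape(k)[0]     # size of the first axis
depth = np.shape(k)[-1]        # size of the last axis

# np.size(k) is a plain int, so indexing it fails.
try:
    np.size(k)[0]
    indexing_failed = False
except TypeError:
    indexing_failed = True

print(total_elements, shape, row_length, seq_len_k, depth, indexing_failed)
```

With a square k, such as one where seq_len_k and depth are equal, row_length, seq_len_k, and depth all coincide, so a test on that input cannot tell them apart.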


Hi Kin. Thank you for clearing up my misconception. I have made the correction to my code. However, I am still unsure why we use only the first part of k instead of the entire k if dk is supposed to be the dimension of the keys.

Hi @TheLonelyChemE ,

The comment line from that function tells us:
key shape == (…, seq_len_k, depth).
where the first element is the number of rows of K, that is, the number of key types that are used to describe a word in the word sequence.
dk is used to scale matmul_qk; it is not the dimension of the keys.
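For comparison, the scaled dot-product attention formula in "Attention Is All You Need" divides the query-key product by the square root of d_k, where d_k is defined as the dimension of each key vector. Under the shape convention quoted above, that reading corresponds to depth, the last axis of k, rather than seq_len_k. Below is a minimal NumPy sketch under that reading. The function name and the random inputs are hypothetical, masking is omitted, and this is not the assignment's TensorFlow solution:

```python
import numpy as np

def scaled_dot_product_attention_sketch(q, k, v):
    """Hypothetical NumPy sketch of scaled dot-product attention (no mask).

    q: (..., seq_len_q, depth)
    k: (..., seq_len_k, depth)
    v: (..., seq_len_k, depth_v)
    """
    # Dot product of every query with every key: (..., seq_len_q, seq_len_k)
    matmul_qk = q @ np.swapaxes(k, -1, -2)

    # Assumption: dk is the length of one key vector, i.e. the last axis of k.
    dk = k.shape[-1]
    scaled = matmul_qk / np.sqrt(dk)

    # Softmax over the key axis, shifted by the row maximum for numerical stability.
    scaled = scaled - scaled.max(axis=-1, keepdims=True)
    weights = np.exp(scaled)
    weights = weights / weights.sum(axis=-1, keepdims=True)

    # Weighted sum of the value vectors: (..., seq_len_q, depth_v)
    output = weights @ v
    return output, weights

# Non-square shapes, so that seq_len_k (5) and depth (3) cannot be confused.
rng = np.random.default_rng(0)
q = rng.normal(size=(2, 3))
k = rng.normal(size=(5, 3))
v = rng.normal(size=(5, 4))

output, weights = scaled_dot_product_attention_sketch(q, k, v)
print(output.shape, weights.shape)
```

Using k.shape[-1] rather than k.shape[0] keeps the scale factor tied to the length of the vectors being dotted together, which is what determines how large the entries of matmul_qk tend to be.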