Natural Language Processing with Attention Models C4W1_Assignment Exercise 2

First off, I tried making a week1 tag, but somehow that didn't seem to be possible?

I'm confused about the cross attention step in the assignment notebook,

specifically because the target is shifted to the right, so it has a different sequence length than the context/encoder output. Just using target + context as inputs to the MHA layer, I get the error:

ValueError: Exception encountered when calling layer 'cross_attention_15' (type CrossAttention).

Inputs have incompatible shapes. Received shapes (15, 256) and (14, 256)

Call arguments received by layer 'cross_attention_15' (type CrossAttention):
• context=tf.Tensor(shape=(64, 14, 256), dtype=float32)
• target=tf.Tensor(shape=(64, 15, 256), dtype=float32)

I tried to remedy this by taking only target[:,1:,:], as that would skip the first token. This doesn't raise an error in the first quick check, but it fails the unit test (error below):

ValueError: Exception encountered when calling layer 'cross_attention_20' (type CrossAttention).

Inputs have incompatible shapes. Received shapes (13, 256) and (14, 512)

Call arguments received by layer 'cross_attention_20' (type CrossAttention):
• context=tf.Tensor(shape=(64, 14, 512), dtype=float32)
• target=tf.Tensor(shape=(64, 14, 256), dtype=float32)

I'm super confused about how to handle this mismatch and feel like I'm overlooking something simple.

Thanks in advance for whatever support you can provide.

I guess this is about the call of the CrossAttention layer (Exercise 2!). In the comments it says:

# Call the MH attention by passing in the query and value
# For this case the query should be the translation and the value the encoded sentence to translate
# Hint: Check the call arguments of MultiHeadAttention in the docs

You are not supposed to add them (target and context), but to pass them as arguments to the MH attention layer, as mentioned in the comments!

Have a look at the call arguments of tf.keras.layers.MultiHeadAttention in the docs: the first argument is query and the second is value.
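For illustration, here is a minimal sketch of how such a CrossAttention layer could wrap tf.keras.layers.MultiHeadAttention. The class structure and names are my own assumptions rather than the assignment's exact code; the point is just that the target goes in as query and the context as value, and that MHA's output length follows the query, so no slicing is needed:

```python
import tensorflow as tf

class CrossAttention(tf.keras.layers.Layer):
    """Minimal sketch (illustrative, not the assignment's solution)."""

    def __init__(self, units, num_heads=2, **kwargs):
        super().__init__(**kwargs)
        self.mha = tf.keras.layers.MultiHeadAttention(
            num_heads=num_heads, key_dim=units
        )
        self.add = tf.keras.layers.Add()
        self.layernorm = tf.keras.layers.LayerNormalization()

    def call(self, context, target):
        # query = the (shifted-right) translation,
        # value = the encoded sentence to translate.
        # The attention output's sequence length follows the QUERY,
        # so it is (batch, target_len, units) even though the context
        # has a different length -- no slicing of the target required.
        attn_output = self.mha(query=target, value=context)
        x = self.add([target, attn_output])  # residual connection
        return self.layernorm(x)

# The shapes from the first error above:
context = tf.random.normal((64, 14, 256))  # encoded sentence
target = tf.random.normal((64, 15, 256))   # shifted-right translation
out = CrossAttention(units=256)(context, target)
print(out.shape)  # (64, 15, 256) -- matches the target, not the context
```

With the arguments swapped (query=context, value=target), the attention output would come back with the context's length, and the residual add against the target would then fail with exactly the kind of shape mismatch shown above.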

Ah, thanks a lot for the hint! It wasn't that I was trying to add the target and context; it was that I had mixed them up!
