Natural Language Processing with Attention Models C4W1_Assignment Exercise 2

Pf_lute0901 · June 23, 2024, 10:16am

First off, I tried making a week1 tag, but it somehow didn’t seem possible?

I’m confused about the cross attention step in Coursera | Online Courses & Credentials From Top Educators. Join for Free | Coursera

specifically, because the target is shifted to the right, it has a different dimension than the context/attention output. Just using target + context as inputs to the MHA layer, I get the error:

ValueError: Exception encountered when calling layer ‘cross_attention_15’ (type CrossAttention).

Inputs have incompatible shapes. Received shapes (15, 256) and (14, 256)

Call arguments received by layer ‘cross_attention_15’ (type CrossAttention):
• context=tf.Tensor(shape=(64, 14, 256), dtype=float32)
• target=tf.Tensor(shape=(64, 15, 256), dtype=float32)

I tried to remedy this by only taking target[:,1:,:], as that would skip the first token, which doesn’t give an error in the first quick check, but fails the unit test (error below):

ValueError: Exception encountered when calling layer ‘cross_attention_20’ (type CrossAttention).

Inputs have incompatible shapes. Received shapes (13, 256) and (14, 512)

Call arguments received by layer ‘cross_attention_20’ (type CrossAttention):
• context=tf.Tensor(shape=(64, 14, 512), dtype=float32)
• target=tf.Tensor(shape=(64, 14, 256), dtype=float32)

I’m super confused about how to handle this mismatch and feel like I’m overlooking something simple that I can’t find.

Thanks in advance for whatever support you can provide

gent.spah · June 23, 2024, 11:08am

I guess this is the call of CrossAttention layer (exercise 2!), in the comments it says:

‘# Call the MH attention by passing in the query and value
# For this case the query should be the translation and the value the encoded sentence to translate
# Hint: Check the call arguments of MultiHeadAttention in the docs’

You are not supposed to add them (target and context) but pass them as arguments to the mh attention layer as mentioned in the comments!

Have a look at the call arguments:

Pf_lute0901 · June 23, 2024, 11:53am

Ah, thanks a lot for the hint! It wasn’t that I was trying to add the target& context, it was that I mixed them up!

Topic		Replies	Views
C4W1 - Cross Attention Exercise 2 and 3 NLP with Attention Models week-1	10	543	April 12, 2024
Homework of W1, I get confused at tf.Add NLP with Attention Models week-1	7	25	January 8, 2025
C4W1_Assigment_Exercise 2 - CrossAttention NLP with Attention Models week-1	9	423	January 7, 2024
NLP C4W1 Exercise 3 Decoder NLP with Attention Models week-1	2	451	January 12, 2024
C4W1_Assignment - Translator NLP with Attention Models week-1	2	380	March 20, 2024

Natural Language Processing with Attention Models C4W1_Assignment Exercise 2

Related topics