C4W1 - Cross Attention Exercise 2 and 3

Benvdv · January 24, 2024, 9:52pm

Getting stuck with Cross_attention function.
I understand I cannot share much info but let me share at least the output I get versus what I should.
Honestly no idea what’s happening and if we can’t share any code here, how can get any feedback and make any progress? So far, quite disappointed by documentation and the videos

Tensor of contexts has shape: (64, 15, 256) Tensor of translations has shape: (64, 14, 256) Tensor of attention scores has shape: (64, 14, 256)

Expected Output

Tensor of contexts has shape: (64, 14, 256)
Tensor of translations has shape: (64, 15, 256)
Tensor of attention scores has shape: (64, 15, 256)

Benvdv · January 24, 2024, 10:19pm

and in exercise 3, getting those errors:

AttributeError Traceback (most recent call last)
Cell In[66], line 3
1 # Test your code!
----> 3 w1_unittest.test_decoder(Decoder, CrossAttention)

File /tf/w1_unittest.py:403, in test_decoder(decoder_to_test, CrossAttention)
400 cases.append(t)
401 return cases
→ 403 cases = g()
404 print_feedback(cases)

File /tf/w1_unittest.py:316, in test_decoder..g()
313 cases.append(t)
315 t = test_case()
→ 316 if not isinstance(decoder.attention, CrossAttention):
317 t.failed = True
318 t.msg = “Incorrect type of attention layer”

AttributeError: ‘Decoder’ object has no attribute ‘attention’

arvyzukai · January 25, 2024, 5:49am

Hi @Benvdv

You’re probably mixing up context with target. Check for that first.

Cheers

lukmanaj · January 25, 2024, 5:52am

Hi @Benvdv,

Welcome to our community, and I’m sorry to hear about the challenges you’re facing. Please be assured that our course mentors are dedicated to assisting you and will provide feedback on your queries, as one has responded to you already. Our community upholds the principles of effective learning and maintaining academic integrity, as outlined in our guidelines. In cases where mentors require a closer look at your code to offer more tailored assistance, they will reach out to you directly via private message.

arvyzukai · January 25, 2024, 5:56am

As for Exercise 3, it indicates that your Decoder implementation is missing self.attention which you should have implemented in the __init__ part of code.

Cheers

Benvdv · January 25, 2024, 8:47am

Thanks. But honestly I don’t know how to answer to suggestions I received given I cannot share code, so still stuck

Benvdv · January 25, 2024, 8:49am

If you could tell me what Tensor Flow function I should call.

In my init I use LSTM calls and CrossAttention call, I don’t know of any tensorflow self.attention

arvyzukai · January 25, 2024, 2:11pm

Between your LSTM “calls” (they are not actually “calls”, they are instances that are saved when you initialize the Decoder class) there is your “CrossAttention call” which you have to implement:

...

        # The attention layer
        self.attention = None(None)
...

In other words, you should have saved your CrossAttention “call” (instance) in the variable self.attention and what the unit test tells you is that it cannot find it.

Cheers

Cawnpore_Charlie · April 12, 2024, 6:11am

I got thru the assignment but am confused that per the instructions: “query should be the translation and the value the encoded sentence to translate”

I would have thought that it would be the other way around given the example in the lectures. Would appreciate any color/clarification. Thank you.

arvyzukai · April 12, 2024, 11:12am

No, you must have misunderstood the example (which one in particular) in the lectures.

I explained it previously in detail in this thread or maybe you want a more formal version.

But the short version in simple words is:

query is what you’re looking for (like: “Hey, I have this translation up until this point, which token should come next?”)
key is what could be a match (like: “Hey, these are the tokens in the original sentence, let’s see which align best.”)
value is what meaning to carry for translation (like: “OK, let’s take the best aligned candidates, sum their meanings and pass it for further token prediction”).

So, in similar vein, “the query should be the translation and the value the encoded sentence to translate”.

Cheers

Cawnpore_Charlie · April 12, 2024, 6:19pm

Thank you so much for your great explanation and very helpful additional pointers - and prompt response

Very helpful as always - deeply appreciated.

Thanks, again.

Topic		Replies	Views
Natural Language Processing with Attention Models C4W1_Assignment Exercise 2 NLP with Attention Models week-1	2	131	June 23, 2024
NLP C4W1 Exercise 3 Decoder NLP with Attention Models week-1	2	452	January 12, 2024
Uable to pass w1_unittest.test_decoder(Decoder, CrossAttention) NLP with Attention Models week-1	2	341	March 3, 2024
C4W1 Assignment - Exercise 3 Decoder Function NLP with Attention Models week-1	6	357	May 24, 2024
NMT with Attention Model NLP with Attention Models	2	390	January 2, 2024

C4W1 - Cross Attention Exercise 2 and 3

Expected Output

Related topics