C4W1 - Cross Attention Exercise 2 and 3

I'm getting stuck with the Cross_attention function.
I understand I cannot share much info, but let me at least share the output I get versus what I should get.
Honestly, I have no idea what's happening, and if we can't share any code here, how can I get any feedback and make any progress? So far I'm quite disappointed by the documentation and the videos.

Tensor of contexts has shape: (64, 15, 256)
Tensor of translations has shape: (64, 14, 256)
Tensor of attention scores has shape: (64, 14, 256)

Expected Output
Tensor of contexts has shape: (64, 14, 256)
Tensor of translations has shape: (64, 15, 256)
Tensor of attention scores has shape: (64, 15, 256)

And in Exercise 3, I'm getting this error:


AttributeError                            Traceback (most recent call last)
Cell In[66], line 3
      1 # Test your code!
----> 3 w1_unittest.test_decoder(Decoder, CrossAttention)

File /tf/w1_unittest.py:403, in test_decoder(decoder_to_test, CrossAttention)
    400     cases.append(t)
    401     return cases
--> 403 cases = g()
    404 print_feedback(cases)

File /tf/w1_unittest.py:316, in test_decoder.<locals>.g()
    313     cases.append(t)
    315 t = test_case()
--> 316 if not isinstance(decoder.attention, CrossAttention):
    317     t.failed = True
    318     t.msg = "Incorrect type of attention layer"

AttributeError: 'Decoder' object has no attribute 'attention'

Hi @Benvdv

You're probably mixing up context with target: notice that your contexts and translations shapes are exactly swapped relative to the expected output. Check for that first.
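To see why the shapes follow the query, here is a minimal sketch with tf.keras.layers.MultiHeadAttention (which is typically what this kind of CrossAttention layer wraps; treat that as an assumption, the shapes below are illustrative). The output takes its sequence length from the query, so swapping the two inputs swaps the shapes exactly as in your output:

    import tensorflow as tf

    # Illustrative shapes: batch 64, source length 14, target length 15, 256 units.
    contexts = tf.random.normal((64, 14, 256))      # encoder output (sentence to translate)
    translations = tf.random.normal((64, 15, 256))  # target translation so far

    mha = tf.keras.layers.MultiHeadAttention(num_heads=1, key_dim=256)
    attn_output = mha(query=translations, value=contexts)
    print(attn_output.shape)  # (64, 15, 256): follows the query's length, not the context's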

Cheers


Hi @Benvdv,

Welcome to our community, and I'm sorry to hear about the challenges you're facing. Please be assured that our course mentors are dedicated to assisting you and will provide feedback on your queries, as one already has. Our community upholds the principles of effective learning and academic integrity, as outlined in our guidelines. Where mentors need a closer look at your code to offer more tailored assistance, they will reach out to you directly via private message.


As for Exercise 3, the error indicates that your Decoder implementation is missing self.attention, which you should have defined in the __init__ method.

Cheers

Thanks. But honestly, I don't know how to act on the suggestions I received given that I cannot share code, so I'm still stuck.

Could you tell me which TensorFlow function I should call?

In my __init__ I use LSTM calls and a CrossAttention call; I don't know of any TensorFlow self.attention.

Between your LSTM "calls" (they are not actually "calls"; they are instances that are saved when you initialize the Decoder class) there is your "CrossAttention call", which you have to implement:

...

        # The attention layer
        self.attention = None(None)
...

In other words, you should have saved your CrossAttention "call" (instance) in the variable self.attention; the unit test is telling you that it cannot find it.
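For instance, a minimal sketch of that pattern (the units argument is just a placeholder here; use whatever your assignment template actually passes in):

    import tensorflow as tf

    class Decoder(tf.keras.layers.Layer):
        def __init__(self, units, **kwargs):
            super().__init__(**kwargs)
            # ... your embedding and LSTM layers ...
            # Save an instance of the CrossAttention class you built in
            # Exercise 2 as an attribute, so that decoder.attention exists
            # when the unit test looks for it:
            self.attention = CrossAttention(units)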

Cheers


I got through the assignment but am confused because the instructions say: "query should be the translation and the value the encoded sentence to translate"

I would have thought it would be the other way around, given the example in the lectures. Would appreciate any color/clarification. Thank you.

No, you must have misunderstood the example in the lectures (which one in particular?).

I explained it previously in detail in this thread, or maybe you want a more formal version.

But the short version in simple words is:

  • query is what you're looking for (like: "Hey, I have this translation up to this point; which token should come next?")
  • key is what could be a match (like: "Hey, these are the tokens in the original sentence; let's see which align best.")
  • value is what meaning to carry over for the translation (like: "OK, let's take the best-aligned candidates, sum their meanings, and pass that on for further token prediction.")

So, in a similar vein, "the query should be the translation and the value the encoded sentence to translate".
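If it helps to see those roles concretely, here is a toy sketch of dot-product attention (shapes are illustrative, not the assignment's):

    import tensorflow as tf

    q = tf.random.normal((1, 15, 256))      # query: the translation so far
    k = v = tf.random.normal((1, 14, 256))  # key/value: the encoded source sentence

    # How well each target token aligns with each source token: (1, 15, 14)
    scores = tf.matmul(q, k, transpose_b=True) / tf.sqrt(256.0)
    weights = tf.nn.softmax(scores, axis=-1)
    # Weighted sum of source "meanings", one per target position: (1, 15, 256)
    attended = tf.matmul(weights, v)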

Cheers


Thank you so much for your great explanation, the very helpful additional pointers, and the prompt response.

Very helpful as always - deeply appreciated.

Thanks, again.
