W4_Ex-4 | (Encoder) - Stuck with no clue what to do next! Please help!

Hi there -

I am completely stuck at the Course 5 Week 4 Exercise 4 (UNQ_C4) step. I have spent a lot of time reading through the Keras documentation, going through the function step by step, trying several combinations based on the instructions provided, etc., but I have no idea how to get the call() function to work.

At this point, I am completely frustrated and am close to calling it quits. I’ve managed to work through all prior assignments from Course 1 through Course 5 Week 3 on my own (after a bit of struggling in some cases), but I feel like I’ve hit an impenetrable wall this time around.

I don’t want to post my code as it likely goes against the Honor Code, but I think my issue is with the initial call to the self.mha() layer:

    # START CODE HERE
    # calculate self-attention using mha (~1 line)
    attn_output = self.mha()  # Self attention (batch_size, input_seq_len, fully_connected_dim)

Can anyone point me in the right direction?

/santosh


Lots of students have difficulty with this assignment - it isn’t very well-written.

Have you tried searching on the forum here for posts from other students? There has been a lot of discussion about it.

/// Update 11/2022 ///
For self-attention, you call self.mha(…) with x for all three Q, K, and V arguments, and also pass the mask. This is discussed in Sections 3, 4, and 4.1.
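
For reference, here is a minimal, standalone sketch of how a Keras MultiHeadAttention layer is called for self-attention with a mask. The shapes, names, and mask convention below are illustrative assumptions, not the graded notebook's exact setup:

    import tensorflow as tf

    # Standalone illustration of Keras MultiHeadAttention used for self-attention.
    mha = tf.keras.layers.MultiHeadAttention(num_heads=2, key_dim=64)

    batch_size, seq_len, d_model = 4, 10, 128
    x = tf.random.uniform((batch_size, seq_len, d_model))

    # Mask broadcastable to (batch, num_queries, num_keys):
    # 1 means "attend to this position", 0 means "ignore it (padding)".
    mask = tf.ones((batch_size, seq_len, seq_len))

    # Self-attention: the same tensor x is used as query, value, and key.
    attn_output = mha(query=x, value=x, key=x, attention_mask=mask)
    print(attn_output.shape)  # (4, 10, 128)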


Thanks @TMosh! I tried looking through some of the existing threads but couldn’t find anything that was immediately helpful to get me unstuck.
I can give it another go (after my brain decompresses a bit), but I’m not sure how this will help! Would appreciate other suggestions!

Thanks,
/santosh

@santoshsastry I’m stuck there too. This assignment is by far the most challenging, and it lacks some of the typical hints in the code to help a little more.
So far, reading the Transformer documentation has helped me a lot.


@kleber - yeah, it is challenging and the notebook documentation is not very helpful. I had to spend a ton of time reviewing the TF/Keras documentation and looking through examples to understand how to proceed further. I managed to complete the course successfully, but it took a lot more effort and time than I had anticipated.

Best of luck!
/santosh

I also managed to complete it. But I must say this last week’s topic is incredibly complex. Reading the Transformer/Encoder/Decoder documentation is absolutely essential for this assignment. I think it could be better designed so the student could absorb the concepts more concretely. I’ll think about some good feedback on it.

I am also stuck here. Can any of you who completed it help me out?

You can try searching the TensorFlow documentation: Text | TensorFlow

Hi @TMosh
I got this error and have been stuck here for a long time. I read a blog post that suggests using the mask in the attention layer, but I don’t know how. Could you please help me?

In EncoderLayer(), the only use of “mask” is with self.mha().
The only use of “training=training” is with self.dropout_ffn().
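
To make that concrete, here is a rough standalone sketch of how those two arguments typically fit into an encoder layer's forward pass. The class structure, layer names, and hyperparameters below are assumptions for illustration, not the graded notebook's code:

    import tensorflow as tf

    class SketchEncoderLayer(tf.keras.layers.Layer):
        """Rough sketch of a Transformer encoder layer, not the graded notebook code."""

        def __init__(self, d_model=128, num_heads=4, dff=512, dropout_rate=0.1):
            super().__init__()
            self.mha = tf.keras.layers.MultiHeadAttention(num_heads=num_heads, key_dim=d_model)
            self.ffn = tf.keras.Sequential([
                tf.keras.layers.Dense(dff, activation='relu'),
                tf.keras.layers.Dense(d_model),
            ])
            self.layernorm1 = tf.keras.layers.LayerNormalization(epsilon=1e-6)
            self.layernorm2 = tf.keras.layers.LayerNormalization(epsilon=1e-6)
            self.dropout_ffn = tf.keras.layers.Dropout(dropout_rate)

        def call(self, x, training, mask):
            # Self-attention: x is query, value, and key; the mask is used ONLY here.
            attn_output = self.mha(x, x, x, attention_mask=mask)          # (batch, seq_len, d_model)
            out1 = self.layernorm1(x + attn_output)                       # residual + layer norm
            ffn_output = self.ffn(out1)                                   # position-wise feed-forward
            ffn_output = self.dropout_ffn(ffn_output, training=training)  # training flag used ONLY here
            return self.layernorm2(out1 + ffn_output)                     # residual + layer norm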


Thank you so much!! :heart:
I solved the problem.

Didn’t solve the problem for me. The variable names are awful. Whoever wrote this needs to look at the quality of the other programming exercises and learn from them.


Your answer saved my day. But why is the usage of mask different from the usage of training as a parameter?
I don’t want to post the code here, but this seems very strange to me.

The masking is discussed in Sections 2.1 and 2.2.
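
The short version of the difference: the mask is data (it is computed from the batch itself, marking which positions are padding, so it is passed to the attention layer as an input), while training is a mode flag that switches the behavior of layers such as Dropout. A small illustrative sketch, assuming the convention that 1 marks a real token and 0 marks padding:

    import tensorflow as tf

    # The mask depends on the batch: which tokens are padding (id 0 assumed here).
    seq = tf.constant([[7, 6, 0, 0],
                       [1, 2, 3, 0]])
    padding_mask = tf.cast(tf.math.not_equal(seq, 0), tf.float32)
    print(padding_mask)  # [[1. 1. 0. 0.], [1. 1. 1. 0.]]

    # training does not describe the data; it switches layer behavior.
    # Dropout drops activations only when training=True.
    dropout = tf.keras.layers.Dropout(0.5)
    x = tf.ones((2, 4))
    print(dropout(x, training=True))   # roughly half the entries zeroed, the rest scaled by 2
    print(dropout(x, training=False))  # unchanged: dropout is a no-op at inference time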