I have a problem with my submission, although I followed the hint.
There was a problem compiling the code from your notebook. Details:
Exception encountered when calling layer ‘softmax_3’ (type Softmax).
{{function_node _wrapped__AddV2_device/job:localhost/replica:0/task:0/device:CPU:0}} Incompatible shapes: [1,2,2,150] vs. [1,1,1,2] [Op:AddV2] name:
Call arguments received by layer ‘softmax_3’ (type Softmax):
• inputs=tf.Tensor(shape=(1, 2, 2, 150), dtype=float32)
• mask=tf.Tensor(shape=(1, 1, 1, 2), dtype=float32)
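The two shapes in the traceback can be checked by hand. The sketch below is a plain-Python illustration of the usual right-to-left broadcasting rule; `broadcast_shape` is a hypothetical helper, not part of TensorFlow or the notebook. It also assumes the score tensor is laid out as (batch, heads, query_len, key_len). Under that assumption, a mask can only be added if its last axis is 1 or equals the key length (150 here).

```python
def broadcast_shape(a, b):
    """Return the broadcast result of two shapes, or raise ValueError.

    Hypothetical helper for illustration only.
    """
    result = []
    # Compare dimensions from right to left, padding the shorter shape with 1s.
    for i in range(1, max(len(a), len(b)) + 1):
        da = a[-i] if i <= len(a) else 1
        db = b[-i] if i <= len(b) else 1
        # Two dimensions are compatible only if they are equal or one of them is 1.
        if da != db and da != 1 and db != 1:
            raise ValueError(f"Incompatible shapes: {list(a)} vs. {list(b)}")
        result.append(max(da, db))
    return tuple(reversed(result))


# Shapes taken from the error message above.
# The (batch, heads, query_len, key_len) reading is an assumption.
scores_shape = (1, 2, 2, 150)
mask_from_error = (1, 1, 1, 2)       # last axis is 2, not 150
mask_matching_keys = (1, 1, 1, 150)  # last axis equals the key length

print(broadcast_shape(scores_shape, mask_matching_keys))
try:
    broadcast_shape(scores_shape, mask_from_error)
except ValueError as err:
    print(err)
```

A mask whose last axis is 2 was most likely built from a length-2 sequence, while the keys being attended to have length 150.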
Do you know the posting rules? You are not supposed to publish code solutions!
For attention_weights, try it without the axis parameter!
Thank you, and sorry. I read a lot of posts about the same problem, but nobody could tell what the error was without seeing the code.
Cheers
Yes, next time ask them to send it in private!
I fixed it as you suggested, but it still does not work.
Send me the entire notebook in private and let me have a look at it…
Has anybody done this assignment?
I don’t know how to fix it.
Maybe try resetting the notebook and redoing the entire assignment; sometimes the problems turn up when you go through it again. But keep your current solutions so you can reuse them!
For future learners - the OP’s mistake was in defining the dec_padding_mask.
Note that when creating the padding mask for the decoder’s second attention block, we use the encoder_input. In other words, we tell the decoder not to pay attention to the padding tokens of the document to be summarized.
Also note that this is different from the look_ahead_mask (causal mask), where the decoder is only allowed to pay attention to itself and its previous tokens.
Cheers
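To make the difference between the two masks concrete, here is a small NumPy sketch (not the notebook's code). The helper names `padding_mask` and `causal_mask`, the pad id 0, the "1.0 means keep" convention, and the toy token ids are all assumptions made for illustration. The point is only that a padding mask built from the encoder input has the same last axis as the keys in the second attention block, while one built from the decoder output does not.

```python
import numpy as np


def padding_mask(token_ids, pad_id=0):
    """1.0 for real tokens, 0.0 for padding; shape (batch, 1, seq_len).

    Hypothetical helper; pad_id=0 and the 1.0-means-keep convention are assumptions.
    """
    mask = (np.asarray(token_ids) != pad_id).astype(np.float32)
    return mask[:, np.newaxis, :]


def causal_mask(seq_len):
    """Lower-triangular mask: position i may attend to positions 0..i."""
    return np.tril(np.ones((1, seq_len, seq_len), dtype=np.float32))


# Toy data (made up): a padded "document" and the "summary" generated so far.
encoder_input = np.array([[7, 3, 9, 0, 0]])  # length 5, last two are padding
decoder_output = np.array([[1, 4]])          # length 2

# Second attention block: queries come from the decoder (length 2),
# keys come from the encoder (length 5), so the scores are (batch, 2, 5).
cross_scores = np.zeros((1, decoder_output.shape[1], encoder_input.shape[1]))

cross_mask = padding_mask(encoder_input)          # shape (1, 1, 5)
self_mask = causal_mask(decoder_output.shape[1])  # shape (1, 2, 2)
wrong_mask = padding_mask(decoder_output)         # shape (1, 1, 2)

print((cross_scores + cross_mask).shape)  # broadcasts over the query axis
print(self_mask[0])

try:
    cross_scores + wrong_mask  # last axes are 5 vs. 2
except ValueError as err:
    print("cannot broadcast:", err)
```

The failing addition at the end mirrors the shape mismatch reported in the original post, where the mask's last axis was 2 while the scores' last axis was 150.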
Thank you so much. I was getting the same error, and this solved it.
re: "when creating padding mask for the decoder’s second attention block - we use the encoder_input"
Then in the Transformer class definition in the notebook is the following comment incorrect?
def call(self, input_sentence, output_sentence, training, enc_padding_mask, look_ahead_mask, dec_padding_mask):
"""
Forward pass for the entire Transformer
Arguments:
input_sentence (tf.Tensor): Tensor of shape …
dec_padding_mask (tf.Tensor): Boolean mask for the second multihead attention layer
The comment says that we should use dec_padding_mask for the second mha layer but in reality we should be using the enc_padding_mask - am I understanding you correctly?
Thank you.
I don’t know if you are asking about the same chunk of code, but if so I don’t think your interpretation is correct. I believe that the previous hint that Arvydas is giving there is referring to the next_word function. In that logic you create the dec_padding_mask by calling the create_padding_mask function. And the question is what argument you pass to that function in order to achieve that correctly. At least that’s my reading of it.
Thank you for the prompt reply.
I think I finally get it. In the Transformer class definition, the parameter is called dec_padding_mask for generality, since it is intended for use in the second attention layer of the decoder. But when the model is called in the next_word function, this mask needs to be based on the encoder input.
I hope I am getting it right.
Thank you.
Thank you, it actually works. I’ve been trying to fix it :((