I have a problem with submission, although I followed the hint:
There was a problem compiling the code from your notebook. Details:
Exception encountered when calling layer ‘softmax_3’ (type Softmax).
{{function_node _wrapped__AddV2_device/job:localhost/replica:0/task:0/device:CPU:0}} Incompatible shapes: [1,2,2,150] vs. [1,1,1,2] [Op:AddV2] name:
Call arguments received by layer ‘softmax_3’ (type Softmax):
• inputs=tf.Tensor(shape=(1, 2, 2, 150), dtype=float32)
• mask=tf.Tensor(shape=(1, 1, 1, 2), dtype=float32)
Do you know the posting rules? You are not supposed to publish code solutions!
At attention_weights, try without the axis parameter!
Thank you, and sorry. I read a lot of posts with the same problem, but nobody could tell what the error was without seeing the code.
Cheers
Yes, next time ask them to send it in private!
I have fixed it as you suggested, but it still doesn't work.
Send me the entire notebook in private and let me have a look at it…
Has anybody done this assignment? I don't know how to fix it.
Maybe try resetting the notebook and redoing the entire assignment; sometimes the problems are found by going through it again. But keep your current solutions so you can reuse them!
For future learners - the OP's mistake was in defining the dec_padding_mask.
Note that when creating the padding mask for the decoder's second attention block, we use the encoder_input. In other words, we tell the decoder not to pay attention to the padding tokens of the document being summarized.
Also note that this is different from the look_ahead_mask (causal mask), where the decoder is only allowed to pay attention to the current token and the ones before it.
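To make the distinction concrete, here is a minimal sketch of the three masks in the style of the TensorFlow Transformer tutorial. The helper create_look_ahead_mask and the exact 0/1 convention are assumptions on my part and may differ from the assignment's own helpers:

```python
import tensorflow as tf

# Hypothetical helpers in the style of the TensorFlow Transformer tutorial;
# the assignment's own versions may use a different 0/1 convention.
def create_padding_mask(seq):
    # 1.0 wherever the token id is padding (0), broadcastable over attention logits
    mask = tf.cast(tf.math.equal(seq, 0), tf.float32)
    return mask[:, tf.newaxis, tf.newaxis, :]  # (batch, 1, 1, seq_len)

def create_look_ahead_mask(size):
    # Upper triangle is masked: position i may not attend to positions after i
    return 1 - tf.linalg.band_part(tf.ones((size, size)), -1, 0)

encoder_input = tf.constant([[5, 7, 9, 0, 0]])  # toy document, padded to length 5
decoder_output = tf.constant([[1, 4, 6]])       # toy summary generated so far

enc_padding_mask = create_padding_mask(encoder_input)                  # encoder self-attention
look_ahead_mask = create_look_ahead_mask(tf.shape(decoder_output)[1])  # decoder block 1 (causal)
dec_padding_mask = create_padding_mask(encoder_input)                  # decoder block 2: built from encoder_input
```

The last line is the whole point: even though it is called dec_padding_mask, it is computed from the encoder's input, because the decoder's second attention block attends over the encoder output.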
Cheers
Thank you so much, I was getting the same error and this solved it.
re: "when creating the padding mask for the decoder's second attention block, we use the encoder_input"
Then is the following comment in the Transformer class definition in the notebook incorrect?
```python
call(self, input_sentence, output_sentence, training, enc_padding_mask, look_ahead_mask, dec_padding_mask):
    """
    Forward pass for the entire Transformer

    Arguments:
        input_sentence (tf.Tensor): Tensor of shape …
        dec_padding_mask (tf.Tensor): Boolean mask for the second multihead attention layer
```
The comment says that we should use dec_padding_mask for the second MHA layer, but in reality we should be using the enc_padding_mask - am I understanding you correctly?
Thank you.
I don't know if you are asking about the same chunk of code, but if so, I don't think your interpretation is correct. I believe the hint that Arvydas is giving there refers to the next_word function. In that logic you create the dec_padding_mask by calling the create_padding_mask function, and the question is what argument you pass to that function in order to do it correctly. At least that's my reading of it.
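If it helps future readers, here is a rough, hypothetical skeleton of that next_word logic, reusing the mask helpers sketched earlier in the thread. This is not the assignment's actual code; the real signature and return value may differ:

```python
# Hypothetical skeleton of next_word - only the mask creation is the point here;
# the assignment's real function and return value may differ.
def next_word(model, encoder_input, output):
    enc_padding_mask = create_padding_mask(encoder_input)
    look_ahead_mask = create_look_ahead_mask(tf.shape(output)[1])
    # The point discussed above: this mask is also built from encoder_input,
    # not from the decoder's output-so-far.
    dec_padding_mask = create_padding_mask(encoder_input)

    # Argument order as quoted from the Transformer.call docstring earlier in the thread:
    # (input_sentence, output_sentence, training, enc_padding_mask, look_ahead_mask, dec_padding_mask)
    logits = model(encoder_input, output, False,
                   enc_padding_mask, look_ahead_mask, dec_padding_mask)
    return tf.argmax(logits[:, -1:, :], axis=-1)  # id of the most likely next token
```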
Thank you for the prompt reply.
I think I finally get it. In the Transformer class definition it is referred to as dec_padding_mask for generality, since it is intended for use in the second attention layer of the decoder, but in the model call inside the next_word function this parameter needs to be based on the encoder input.
I hope I am getting it right.
Thank you.
Thank you, it actually works. I had been trying to fix it :((