Hi,
My problem is similar, except my error is in "softmax_404".
My graded function next_word returns the expected output ([[14859]], "masses") and passes the unit test:
w2_unittest.test_next_word(next_word, transformer, encoder_input, output)
" All tests passed!"
But when I try to summarize a sentence, this line:
summarize(transformer, document[training_set_example])
ends up in this code in DecoderLayer.call:
mult_attn_out2, attn_weights_block2 = self.mha2(…)
which throws the following error:
InvalidArgumentError: Exception encountered when calling layer "softmax_404" (type Softmax).
{{function_node _wrapped__AddV2_device/job:localhost/replica:0/task:0/device:GPU:0}} required broadcastable shapes [Op:AddV2] name:
Call arguments received by layer "softmax_404" (type Softmax):
• inputs=tf.Tensor(shape=(1, 2, 2, 150), dtype=float32)
• mask=tf.Tensor(shape=(1, 1, 1, 2), dtype=float32)
I printed these from next_word:
enc_padding_mask: tf.Tensor(
[[[1. 1. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]], shape=(1, 1, 150), dtype=float32)
look_ahead_mask: tf.Tensor([[[1.]]], shape=(1, 1, 1), dtype=float32)
dec_padding_mask: tf.Tensor([[[1.]]], shape=(1, 1, 1), dtype=float32)
output: tf.Tensor([[7]], shape=(1, 1), dtype=int32)
Predicted token: [[14859]]
Predicted word: masses
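As far as I can tell, the error comes down to the two shapes in the message above: the last axis of inputs is 150, while the last axis of mask is 2. Here is a minimal sketch of that check, assuming TensorFlow applies the usual NumPy-style broadcasting rules. The helper broadcastable is my own illustration, not part of the assignment or of TensorFlow, and the (1, 1, 1, 150) mask shape is only a hypothetical example of a shape that would broadcast.

```python
from itertools import zip_longest

def broadcastable(shape_a, shape_b):
    """Return True if two shapes are compatible under NumPy-style broadcasting."""
    # Compare axes from the trailing one backwards; a missing axis counts as size 1.
    for a, b in zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1):
        if a != b and a != 1 and b != 1:
            return False
    return True

scores_shape = (1, 2, 2, 150)  # `inputs` from the error message
mask_shape = (1, 1, 1, 2)      # `mask` from the error message

# Trailing axes are 150 vs 2, so the add inside Softmax cannot broadcast.
print(broadcastable(scores_shape, mask_shape))

# Hypothetical mask whose last axis matches the 150 encoder positions.
print(broadcastable(scores_shape, (1, 1, 1, 150)))
```

So it looks like the mask that reaches self.mha2 has a last axis of 2 where the attention scores have 150, but I can't tell which of my masks is the wrong one.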
Any suggestions would be greatly appreciated. Thanks.
John