I knew that, but I still can't figure it out, which is why I wrote this post. Could you please help me pinpoint what's wrong?
I used the following:
(mask * -1e9)
tf.keras.activations.softmax(scaled_attention_logits, axis=-1)
Appreciate your assistance
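For context, the sign of the mask term depends on which convention the mask tensor follows. Here is a minimal sketch of the difference, assuming a padding mask where 1 marks positions to keep and 0 marks padding (this convention, and the toy values, are assumptions for illustration, not taken from the post above):

```python
import tensorflow as tf

# Toy logits for one query over three key positions; the last position is padding.
scaled_attention_logits = tf.constant([[2.0, 1.0, 0.5]])
mask = tf.constant([[1.0, 1.0, 0.0]])  # assumed convention: 1 = keep, 0 = padding

# With a "1 = keep" mask, the padded position must receive the -1e9 bias,
# so the added term is (1 - mask) * -1e9 rather than mask * -1e9.
logits_masked = scaled_attention_logits + (1.0 - mask) * -1e9
weights = tf.keras.activations.softmax(logits_masked, axis=-1)
# weights[..., -1] is ~0, i.e. the padded position gets essentially no attention.
```

If the mask instead uses 1 to mark positions to ignore (as some tutorials do), the term `mask * -1e9` is the right one; mixing the two conventions inverts the masking.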
Thanks, that's helpful and solves the issue.
May I ask why we changed the value from the one recommended in the library?
Is it based on trial and error, or on some analysis I should know about?
I didn't get this hint. I don't think the 1 needs to be a float, because the term is multiplied by the float -1e9 anyway. I'm not criticizing, just giving feedback.
Where are we supposed to use (1 - mask) * -1e9 in the code? I tried adding it to the scaled_attention_logits, but I'm getting the 'wrong masked weights' error. Could you please help me with this issue?
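For what it's worth, here is a minimal sketch of where that term usually goes inside scaled dot-product attention. It assumes a mask broadcastable to the logits with 1 marking positions to keep; the function name and that keep/ignore convention are assumptions for illustration, not taken from the tutorial:

```python
import tensorflow as tf

def scaled_dot_product_attention(q, k, v, mask=None):
    # q: (..., seq_len_q, depth), k: (..., seq_len_k, depth), v: (..., seq_len_k, depth_v)
    matmul_qk = tf.matmul(q, k, transpose_b=True)            # (..., seq_len_q, seq_len_k)
    dk = tf.cast(tf.shape(k)[-1], tf.float32)
    scaled_attention_logits = matmul_qk / tf.math.sqrt(dk)

    if mask is not None:
        # Assumed convention: mask == 1 keeps a position, mask == 0 masks it out.
        # Masked positions get a large negative bias so softmax drives their weights to ~0.
        scaled_attention_logits += (1.0 - mask) * -1e9

    attention_weights = tf.keras.activations.softmax(scaled_attention_logits, axis=-1)
    return tf.matmul(attention_weights, v), attention_weights
```

The key point is that the mask term is added to the scaled logits before the softmax; if the weights come out inverted, the mask convention (1 = keep vs. 1 = ignore) is the first thing to check.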