Yes, the error is on this assert:
assert tf.is_tensor(attention), "Output must be a tensor"
I computed dk as dk = tf.cast(tf.shape(k)[0], tf.float32),
the mask as mask = tf.linalg.band_part(tf.ones((tf.shape(q)[0], tf.shape(k)[0])), -1, 0),
and in the if condition scaled_attention_logits += (1 - mask) * (-1e9).
For the softmax I used tf.keras.activations.softmax, and I computed the output with tf.matmul.
Can you please correct this?
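For reference, here is a minimal sketch of how those pieces usually fit together, assuming the standard scaled_dot_product_attention(q, k, v, mask) signature from this assignment (the function name and 2-D shapes are assumptions on my part). One likely culprit in your version is dk: the Transformer paper scales by the square root of the key depth, which is tf.shape(k)[-1], not the number of keys tf.shape(k)[0].

```python
import tensorflow as tf

def scaled_dot_product_attention(q, k, v, mask=None):
    """Sketch of scaled dot-product attention for 2-D inputs.

    q: (seq_len_q, depth), k: (seq_len_k, depth), v: (seq_len_k, depth_v).
    """
    # Q @ K^T -> (seq_len_q, seq_len_k)
    matmul_qk = tf.matmul(q, k, transpose_b=True)

    # dk is the key *depth* (last axis), not the number of keys:
    # tf.shape(k)[-1], not tf.shape(k)[0].
    dk = tf.cast(tf.shape(k)[-1], tf.float32)
    scaled_attention_logits = matmul_qk / tf.math.sqrt(dk)

    # Push masked positions toward -inf so softmax gives them ~0 weight.
    if mask is not None:
        scaled_attention_logits += (1.0 - mask) * (-1e9)

    attention_weights = tf.keras.activations.softmax(
        scaled_attention_logits, axis=-1)

    # Weighted sum of the values -> (seq_len_q, depth_v)
    output = tf.matmul(attention_weights, v)
    return output, attention_weights
```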
I was having the same issues.
Finally, changing from tf.nn.softmax to tf.keras.activations.softmax solved the issue.
I'm not sure what the difference is, but posting this just in case it helps someone.
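For what it's worth, the two calls should agree numerically on ordinary tensors, so the swap is more likely about what the unit test looks for than about the math. A quick sanity check (the logits here are made up):

```python
import tensorflow as tf

# Made-up logits, just to compare the two softmax implementations.
logits = tf.constant([[2.0, 1.0, 0.1],
                      [0.5, 0.5, 0.5]])

a = tf.nn.softmax(logits, axis=-1)
b = tf.keras.activations.softmax(logits, axis=-1)

# On a 2-D tensor the outputs match to floating-point precision.
print(tf.reduce_max(tf.abs(a - b)).numpy())  # ~0.0
```

One visible behavioral difference, at least in recent TF2 releases: the Keras version raises a ValueError on 1-D input, while tf.nn.softmax accepts it.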
I am getting a wrong shape error:
---> 59 assert tuple(tf.shape(weights).numpy()) == (q.shape[0], k.shape[1]), f"Wrong shape. We expected ({q.shape[0]}, {k.shape[1]})"
60 assert np.allclose(weights, [[0.2589478, 0.42693272, 0.15705977, 0.15705977],
61 [0.2772748, 0.2772748, 0.2772748, 0.16817567],
AssertionError: Wrong shape. We expected (3, 4)
Since I can add the mask to scaled_attention_logits without a broadcasting error, I don't think the logits have the wrong shape, but since the assert says the weights should be (q.shape[0], k.shape[1]), maybe they do. I did transpose k for the first matrix multiplication. The traceback doesn't say where in the function the problem occurs. Any ideas what I could be doing wrong?
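Hard to say without seeing the code, but one way to localize this is to print the shape after each step. A minimal sketch using the test's shapes (3 queries, 4 keys, depth 4; the random values are just placeholders), showing how a swapped or wrongly transposed operand trips the (3, 4) assert:

```python
import numpy as np
import tensorflow as tf

# Illustrative shapes matching the test: 3 queries, 4 keys, depth 4.
q = tf.constant(np.random.rand(3, 4), dtype=tf.float32)
k = tf.constant(np.random.rand(4, 4), dtype=tf.float32)

# Q @ K^T: (3, 4) x (4, 4)^T -> (3, 4) == (seq_len_q, seq_len_k)
matmul_qk = tf.matmul(q, k, transpose_b=True)
print(matmul_qk.shape)  # (3, 4)

# A common slip: multiplying in the wrong order (or transposing the
# wrong operand) gives (4, 3) instead and fails the shape assert.
wrong = tf.matmul(k, q, transpose_b=True)
print(wrong.shape)  # (4, 3)
```

Note that the attention weights are generally (q.shape[0], k.shape[0]), the number of queries by the number of keys; the test's k happens to be square, so the assert's k.shape[1] coincides with that.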
This thread has been cold for four years. Posting here was a bold strategy. I see you also created a new thread, and the conversation is continuing there.