I would like to point to the sentence above the mask in the exercise:
“Just because you’ve worked so hard, we’ll also implement this mask for you. Again, take a close look at the code so you can effectively implement it later.”
def create_look_ahead_mask(sequence_length):
    """
    Returns an upper triangular matrix filled with ones
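(For reference, here is the rest of the function as I have it; I can’t rule out that my copy differs, but the body appears to boil down to this:)

```python
import tensorflow as tf

def create_look_ahead_mask(sequence_length):
    """
    Returns an upper triangular matrix filled with ones
    """
    # Despite the docstring, band_part(..., -1, 0) keeps the LOWER triangle
    # (everything on and below the diagonal), so this is lower triangular:
    mask = tf.linalg.band_part(tf.ones((1, sequence_length, sequence_length)), -1, 0)
    return mask
```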
I don’t remember changing anything. Also, since I just noticed the differing dimensions, I ran it without the additional dimension in the tf.ones call, which gives the same result, both in the output below the function and later on.
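A quick sanity check of that (toy shapes, and I’m only testing that the extra leading dimension broadcasts away):

```python
import tensorflow as tf

n = 3
m2 = tf.linalg.band_part(tf.ones((n, n)), -1, 0)     # shape (3, 3)
m3 = tf.linalg.band_part(tf.ones((1, n, n)), -1, 0)  # shape (1, 3, 3)

scores = tf.random.normal((1, n, n))  # stand-in for attention logits
# Broadcasting makes both mask shapes behave identically when applied:
print(tf.reduce_all(scores + m2 == scores + m3))  # tf.Tensor(True, ...)
```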
I looked into public_tests.py, and the values the tester is looking for do sort of exist, but not at the position attn_w_b1[0, 0, 1] that it checks:
print(attn_w_b1[0, 0, 1]) gives tf.Tensor([0. 0. 1.], shape=(3,), dtype=float32)
while
print(attn_w_b1[0, 0, 0]) gives tf.Tensor([0. 0.49384946 0.50615054], shape=(3,), dtype=float32)
compared to the tester’s expected [0.5271505, 0.47284946, 0.].
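A toy example (made-up logits, and assuming the mask is applied additively as mask * -1e9 before the softmax, the way the TensorFlow tutorial does it) reproduces exactly this flipped pattern of zeros:

```python
import tensorflow as tf

logits = tf.constant([0.1, 0.2, 0.3])  # made-up attention scores for one query

lower = tf.linalg.band_part(tf.ones((3, 3)), -1, 0)  # ones where attention is allowed
upper = 1 - lower                                    # ones where attention is blocked

# Tutorial convention: the mask marks BLOCKED positions, so row 1 of the
# weights gets its zero at the end, like the tester's expected values:
print(tf.nn.softmax(logits + upper[1] * -1e9))  # ~[0.475 0.525 0.   ]

# If the lower triangular matrix is used as the additive mask instead,
# the zeros land at the start, like my output above:
print(tf.nn.softmax(logits + lower[1] * -1e9))  # [0. 0. 1.]
```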
Ok, found it. You are right. Thank you!
In this case the code returns this:
<tf.Tensor: shape=(3, 3), dtype=float32, numpy=
array([[1., 0., 0.],
       [1., 1., 0.],
       [1., 1., 1.]], dtype=float32)>
That is a lower triangular matrix, which might be the reason I really did change the code: it differs from [Transformer model for language understanding | Text | TensorFlow].
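For the record, a one-liner reproduces that exact output, so the function is presumably just a band_part call, which with (-1, 0) keeps the lower triangle including the diagonal:

```python
import tensorflow as tf
# band_part(x, -1, 0) keeps everything on and below the diagonal:
print(tf.linalg.band_part(tf.ones((3, 3)), -1, 0))
```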
God, this is silly.
Ok, onward to the end, hoping for the best for when I actually need to reproduce this somewhere else.
Thank you, TMosh!
Seriously?
I forced a fresh download yesterday and did so again just now. The wrong comment is still there, and so is the questionable matrix dimension, judging also by the comments. Is the dimension correct or not? Who knows?
This is misleading and cost me a lot of time.
Either the revision process isn’t thorough, or the creators didn’t know their maths at the time. Doubling down is a very strange strategy.