In the last piece of code in section 2.1 - Padding Mask, there is an example of how the softmax output differs depending on whether the padding mask is applied:
print(tf.keras.activations.softmax(x))
print(tf.keras.activations.softmax(x + (1 - create_padding_mask(x)) * -1.0e9))
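For context, here is a minimal sketch of create_padding_mask that is consistent with the outputs shown below (the exact implementation in the notebook may differ):

import tensorflow as tf

def create_padding_mask(x):
    # 1.0 at real (non-zero) tokens, 0.0 at padding positions
    mask = tf.cast(tf.math.not_equal(x, 0), tf.float32)
    # add a middle axis so the mask can broadcast over attention rows: (batch, 1, seq_len)
    return mask[:, tf.newaxis, :]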
However, the broadcasting doesn’t make sense to me. I modified the code a little to make my point clearer:
print(x)
y = (1 - create_padding_mask(x)) * -100
print(y)
z = x + y
print(z)
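For anyone who wants to run this standalone, x is the example tensor from the notebook; it can be reconstructed from the printed output below:

x = tf.constant([[7., 6., 0., 0., 1.],
                 [1., 2., 3., 0., 0.],
                 [0., 0., 0., 4., 5.]])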
Here is the output:
tf.Tensor(
[[7. 6. 0. 0. 1.]
 [1. 2. 3. 0. 0.]
 [0. 0. 0. 4. 5.]], shape=(3, 5), dtype=float32)

tf.Tensor(
[[[  -0.   -0. -100. -100.   -0.]]

 [[  -0.   -0.   -0. -100. -100.]]

 [[-100. -100. -100.   -0.   -0.]]], shape=(3, 1, 5), dtype=float32)

tf.Tensor(
[[[   7.    6. -100. -100.    1.]
  [   1.    2.  -97. -100.    0.]
  [   0.    0. -100.  -96.    5.]]

 [[   7.    6.    0. -100.  -99.]
  [   1.    2.    3. -100. -100.]
  [   0.    0.    0.  -96.  -95.]]

 [[ -93.  -94. -100.    0.    1.]
  [ -99.  -98.  -97.    0.    0.]
  [-100. -100. -100.    4.    5.]]], shape=(3, 3, 5), dtype=float32)
I don’t really see why it is broadcast to (3, 3, 5) instead of (3, 1, 5).
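As far as I understand, TensorFlow follows NumPy-style broadcasting: shapes are aligned from the right, missing leading dimensions are treated as size 1, and each result dimension takes the larger of the two sizes. That is what produces the extra dimension here:

# x: (3, 5)   is aligned as (1, 3, 5)
# y: (3, 1, 5)
# result:        (3, 3, 5)
print(tf.broadcast_static_shape(x.shape, y.shape))  # (3, 3, 5)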
The only values that make sense to me in ‘z’ are:
- First row of the first (3, 3) block: [ 7. 6. -100. -100. 1.]
- Second row of the second (3, 3) block: [ 1. 2. 3. -100. -100.]
- Third row of the third (3, 3) block: [-100. -100. -100. 4. 5.]
Therefore, the (3,1,5) z would be this:
tf.Tensor(
[[[ 7. 6. -100. -100. 1.]]
[[ 1. 2. 3. -100. -100.]]
 [[-100. -100. -100. 4. 5.]]], shape=(3, 1, 5), dtype=float32)
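If it helps to state it in code, the only way I see to actually get that (3, 1, 5) result is to give x a matching middle axis before the addition (this is my own workaround, not from the notebook):

z_expected = x[:, tf.newaxis, :] + y   # (3, 1, 5) + (3, 1, 5) -> (3, 1, 5)
print(z_expected)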
Am I wrong?