I have an issue with the size of the masked sequences.
Why does the softmax change the shape of the tensor?
Running this:
print(tf.keras.activations.softmax(x).shape)
print(tf.keras.activations.softmax(x + (create_padding_mask(x) * -1.0e9)).shape)
print(x.shape)
print(create_padding_mask(x).shape)
Gives:
(3, 5)
(3, 1, 3, 5)
(3, 5)
(3, 1, 1, 5)
It seems that the shape (3, 1, 3, 5) is not what I should get, and that causes me trouble afterwards.
Thanks!
softmax won’t change the shape, but (x + create_padding_mask(x) * -1.0e9) will.
x shape: (3, 5)
mask shape: (3, 1, 1, 5)
summation shape: (3, 1, 3, 5)
Please refer to the broadcasting rules.
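As a minimal sketch with placeholder random tensors of the same shapes (these names are just for illustration, not the assignment's data):

import tensorflow as tf

x = tf.random.uniform((3, 5))           # same shape as x above: (3, 5)
mask = tf.random.uniform((3, 1, 1, 5))  # same shape as create_padding_mask(x): (3, 1, 1, 5)

# Broadcasting aligns shapes from the right: (3, 5) behaves like (1, 1, 3, 5),
# so adding it to (3, 1, 1, 5) gives (3, 1, 3, 5).
print((x + mask * -1.0e9).shape)        # (3, 1, 3, 5)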
Thank you Edward.
But then I have the wrong shape for the output of my scaled_dot_product_attention function.
In the scaled_dot_product_attention_test, the output has shape (3, 1, 3, 2), but it looks like it should be (3, 2).
Could you help me with that?
Thanks
The scaled_dot_product_attention function is already given the mask for you; you should not use create_padding_mask to generate your own mask inside it.
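For reference, here is a minimal sketch of the usual scaled dot-product attention pattern (assuming the same masking convention as the snippet above, i.e. the mask is 1 at padded positions and is added as mask * -1.0e9); the key point is that the mask arrives as an argument and is applied inside the function:

import tensorflow as tf

def scaled_dot_product_attention(q, k, v, mask):
    # Attention logits: (..., seq_len_q, seq_len_k)
    matmul_qk = tf.matmul(q, k, transpose_b=True)
    dk = tf.cast(tf.shape(k)[-1], tf.float32)
    scaled_attention_logits = matmul_qk / tf.math.sqrt(dk)

    if mask is not None:
        # Use the mask passed in; do NOT call create_padding_mask here.
        scaled_attention_logits += (mask * -1.0e9)

    attention_weights = tf.nn.softmax(scaled_attention_logits, axis=-1)
    output = tf.matmul(attention_weights, v)  # (..., seq_len_q, depth_v)
    return output, attention_weights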
Thanks a lot! Damn, I was dumb!
omg I did the same thing!!! Thanks @edwardyu mentor for pointing this out.