print(tf.keras.activations.softmax(x))
print(tf.keras.activations.softmax(x + (1 - create_padding_mask(x)) * -1.0e9))
should be
print(tf.keras.activations.softmax(x))
print(tf.keras.activations.softmax(x + (1 - tf.squeeze(create_padding_mask(x), axis=1)) * -1.0e9))
so that the mask has the same shape as x. Without the squeeze, the mask's extra singleton axis makes x + mask broadcast to a higher-rank tensor, and the softmax output no longer matches the shape of x.
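For context, a minimal sketch of the shape issue, assuming a hypothetical create_padding_mask that returns 1 for real tokens and 0 for padding (zeros), with an extra singleton axis for broadcasting against attention logits; the definition used in the original tutorial may differ:

import tensorflow as tf

# Assumed definition (for illustration only): 1 for real tokens, 0 for
# padding, plus a singleton axis so the mask can broadcast against
# per-head attention logits.
def create_padding_mask(x):
    mask = tf.cast(tf.math.not_equal(x, 0), tf.float32)
    return mask[:, tf.newaxis, :]  # (batch_size, 1, seq_len)

x = tf.constant([[7., 6., 0., 0., 1.],
                 [1., 2., 3., 0., 0.],
                 [0., 0., 0., 4., 5.]])     # (3, 5)

print(create_padding_mask(x).shape)         # (3, 1, 5): one axis too many
# (3, 5) + (3, 1, 5) would broadcast to (3, 3, 5), not (3, 5).
mask = tf.squeeze(create_padding_mask(x), axis=1)  # (3, 5): matches x

print(tf.keras.activations.softmax(x))
# Padded positions get -1e9 added, so softmax drives them to ~0.
print(tf.keras.activations.softmax(x + (1 - mask) * -1.0e9))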