Course5_week4 Size of attention_weights

fjoucken · June 17, 2021, 11:42pm

Hi,
What should be the size of the tensor attention_weights in the function scaled_dot_product_attention_test?

Should it be (3,1,14) or (3,1,3,4)?
Thanks, I am stuck!

TMosh · June 18, 2021, 12:02am

Probably you should have posted this in Course 5, not Course 4.

I get (3, 4).

fjoucken · June 18, 2021, 3:43am

Yes, you are right, it should be course 5.

I think my problem comes from the function create_padding_mask, which adds two extra dimensions in its last line:
return seq[:, tf.newaxis, tf.newaxis, :]

I am not sure if I coded this or if it was there already…

Do you have this as well?

TMosh · June 18, 2021, 4:04am

The create_padding_mask() function was provided in the notebook. You did not have to modify it.

fjoucken · June 18, 2021, 4:14am

OK, that’s what I thought.
Then I don’t understand where the error in my scaled_dot_product_attention comes from.

It contains the right values but it seems like the shape of the tensor (3,1,3,2) is not correct.

TMosh · June 18, 2021, 4:28am

attention_weights uses tf.keras.activations.softmax(…).

TMosh · June 18, 2021, 4:30am

When you get your code working, please edit your replies that contain your code and delete the code. That clears you with the course Honor Code.

TMosh · June 18, 2021, 4:33am

And I think axis should be dk, not -1. I’m not sure whether -1 works there in all cases.
But, if you do that, dk should not include the square root. So you’d need to modify your code for the scaled_attention_logits.
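
For comparison, the usual formulation of scaled dot-product attention divides the logits by the square root of dk and applies softmax along the last axis (the key positions), so each query's weights sum to 1. Below is a minimal pure-Python sketch of that formulation, not the notebook's code. The helper names and the input values are hypothetical, and the shapes (3 queries, 4 keys, value depth 2) are only assumed to match the shapes reported in this thread.

```python
import math

def softmax_last_axis(row):
    """Numerically stable softmax over one row (one query's logits)."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def scaled_dot_product_attention_sketch(q, k, v):
    """Unmasked attention on nested lists.

    q: (seq_len_q, dk), k: (seq_len_k, dk), v: (seq_len_k, depth_v)
    Returns (output, attention_weights).
    """
    dk = len(k[0])
    # logits[i][j] = (q_i . k_j) / sqrt(dk)
    logits = [
        [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(dk) for kj in k]
        for qi in q
    ]
    # Softmax over the key positions, i.e. the last axis of the logits.
    weights = [softmax_last_axis(row) for row in logits]
    # output[i] = sum_j weights[i][j] * v[j]
    depth_v = len(v[0])
    output = [
        [sum(w * vj[c] for w, vj in zip(wrow, v)) for c in range(depth_v)]
        for wrow in weights
    ]
    return output, weights

# Made-up inputs: 3 queries, 4 keys, dk = 4, depth_v = 2.
q = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0], [1.0, 1.0, 0.0, 0.0]]
k = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
v = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]

output, weights = scaled_dot_product_attention_sketch(q, k, v)
print(len(weights), len(weights[0]))  # rows x columns of attention_weights
print(len(output), len(output[0]))    # rows x columns of the output
```

With these shapes the weights come out as 3 rows of 4 values and the output as 3 rows of 2 values.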

fjoucken · June 18, 2021, 11:22pm

Thanks.
But it still is not working.
Can you confirm that the shape of the output of scaled_dot_product_attention should be (3,1,3,2) in scaled_dot_product_attention_test?

TMosh · June 19, 2021, 3:06am

No.
I get (3,2) for the output shape.

fjoucken · June 19, 2021, 3:51am

My bad, I was using the mask created earlier instead of the mask passed to that function.
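
One plausible way to see how the wrong mask produces the shapes reported above is to follow the broadcasting rules by hand. The sketch below is pure Python and only computes shapes. `broadcast_shape` and `matmul_shape` are hypothetical helpers written for this illustration, and the mask shapes are assumptions: (3, 1, 1, 4) for a mask returned by create_padding_mask on a batch of 3 sequences of length 4, and (3, 4) for a mask that matches the logits.

```python
def broadcast_shape(a, b):
    """Shape produced by NumPy/TensorFlow-style broadcasting of shapes a and b."""
    n = max(len(a), len(b))
    # Pad the shorter shape with leading 1s, then compare dimension by dimension.
    a = (1,) * (n - len(a)) + tuple(a)
    b = (1,) * (n - len(b)) + tuple(b)
    result = []
    for x, y in zip(a, b):
        if x != y and x != 1 and y != 1:
            raise ValueError(f"shapes {a} and {b} are not broadcastable")
        result.append(y if x == 1 else x)
    return tuple(result)

def matmul_shape(a, b):
    """Shape of a batched matrix product for shapes with at least 2 dimensions."""
    if a[-1] != b[-2]:
        raise ValueError(f"inner dimensions differ: {a[-1]} vs {b[-2]}")
    batch = broadcast_shape(a[:-2], b[:-2])
    return batch + (a[-2], b[-1])

logits_shape = (3, 4)              # (seq_len_q, seq_len_k), assumed
v_shape = (4, 2)                   # (seq_len_k, depth_v), assumed
padding_mask_shape = (3, 1, 1, 4)  # seq[:, tf.newaxis, tf.newaxis, :], assumed batch of 3
matching_mask_shape = (3, 4)       # a mask shaped like the logits, assumed

# Adding the padding mask to the logits broadcasts them up to 4 dimensions.
wrong_weights = broadcast_shape(logits_shape, padding_mask_shape)
wrong_output = matmul_shape(wrong_weights, v_shape)
print(wrong_weights, wrong_output)

# A mask shaped like the logits leaves the shapes unchanged.
right_weights = broadcast_shape(logits_shape, matching_mask_shape)
right_output = matmul_shape(right_weights, v_shape)
print(right_weights, right_output)
```

Under these assumptions the padding mask yields weights of shape (3, 1, 3, 4) and an output of shape (3, 1, 3, 2), while the matching mask yields (3, 4) and (3, 2), which agrees with the values TMosh reported.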
