Hi,
I am having trouble understanding the code for the encoder layer. To compute self-attention, the instructions say: "1. You will pass the Q, V, K matrices and a boolean mask to a multi-head attention layer. Remember that to compute self-attention, Q, V and K should be the same." How do I calculate Q, V and K?
For self-attention, Q, K, and V are all the same tensor: the 'x' input. You need to pass it three times.
The 'mask' is provided as a function parameter; you need to pass it to the attention layer as well.
For dropout1, you also need to pass training=training.
For out2, you need to use out1, not attn_output.
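To illustrate the general pattern (this is not the assignment notebook's code), here is a minimal generic sketch of a Transformer encoder layer using the standard tf.keras.layers.MultiHeadAttention API. The layer names (mha, dropout1, layernorm1, ffn) and the dimensions are assumptions for illustration only:

```python
import tensorflow as tf

class EncoderLayer(tf.keras.layers.Layer):
    """Generic encoder layer sketch: self-attention + feed-forward,
    each followed by dropout, a residual connection, and layer norm."""

    def __init__(self, embedding_dim=128, num_heads=8, ff_dim=512, rate=0.1):
        super().__init__()
        self.mha = tf.keras.layers.MultiHeadAttention(
            num_heads=num_heads, key_dim=embedding_dim)
        self.ffn = tf.keras.Sequential([
            tf.keras.layers.Dense(ff_dim, activation="relu"),
            tf.keras.layers.Dense(embedding_dim),
        ])
        self.layernorm1 = tf.keras.layers.LayerNormalization(epsilon=1e-6)
        self.layernorm2 = tf.keras.layers.LayerNormalization(epsilon=1e-6)
        self.dropout1 = tf.keras.layers.Dropout(rate)
        self.dropout2 = tf.keras.layers.Dropout(rate)

    def call(self, x, training, mask):
        # Self-attention: Q, K and V are all the same tensor x.
        # Keras' MultiHeadAttention takes (query, value, key), so
        # passing x three times covers all of them; the mask is
        # forwarded as attention_mask.
        attn_output = self.mha(x, x, x, attention_mask=mask)
        attn_output = self.dropout1(attn_output, training=training)
        out1 = self.layernorm1(x + attn_output)  # residual connection

        # The feed-forward block consumes out1, NOT attn_output.
        ffn_output = self.ffn(out1)
        ffn_output = self.dropout2(ffn_output, training=training)
        out2 = self.layernorm2(out1 + ffn_output)
        return out2
```

The key point for out2 is that the feed-forward block runs on out1 (the normalized residual), so attn_output should not appear anywhere past the first layer norm.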
Also, please edit your message to remove the code. Posting your code isn’t allowed by the course Honor Code.