Hello,
In C4_W1_Ungraded_Lab_1_Basic_Attention, it says the dimensions of W_a and U_a are (n x m), where n is the hidden state size and m is the layer size in the alignment network.
I’m confused as to what “n” and “m” are exactly. Could you please explain with the help of an actual example?
Thanks
Ani
Hi @Anivader
I reproduced this lab in an Excel sheet when I was learning. I can share it if it helps:
- Here are the inputs in the alignment function, as in this part:
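To make the shapes concrete, here is a minimal numpy sketch of those inputs. I’m assuming hidden_size = 16, attention_size = 10 and an input sequence of length 5, as in the lab; the random values are only placeholders:

```python
import numpy as np

hidden_size = 16      # n: size of each encoder/decoder hidden state
attention_size = 10   # m: size of the alignment layer
input_length = 5      # assumed number of encoder states (sequence length)

np.random.seed(42)
encoder_states = np.random.randn(input_length, hidden_size)  # h_1 ... h_5, shape (5, 16)
decoder_state = np.random.randn(1, hidden_size)              # s_{i-1}, shape (1, 16)
```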

- Then there is a linear transformation. Here are the weights (note that they are already transposed for convenience, as in the code): shape (2n, m), i.e. (2*hidden_size, attention_size) = (2*16, 10). They are a stacked version of W_a and U_a, because $W_a \cdot s_{i-1} + U_a \cdot h_j$ is equivalent to $[W_a \,|\, U_a] \cdot [s_{i-1} ; h_j]$ (the two matrices concatenated side by side, applied to the two vectors stacked on top of each other).
When you dot-product these matrices you get the following, as in here:
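Continuing the numpy sketch (layer_1 below stands in for the stacked, pre-transposed [W_a | U_a]):

```python
# Stacked, pre-transposed weights: shape (2 * hidden_size, attention_size) = (32, 10)
layer_1 = np.random.randn(2 * hidden_size, attention_size)

# Concatenate the (repeated) decoder state with every encoder state: shape (5, 32)
inputs = np.concatenate([np.repeat(decoder_state, input_length, axis=0),
                         encoder_states], axis=1)

# Linear transformation: (5, 32) @ (32, 10) -> (5, 10)
pre_activations = np.matmul(inputs, layer_1)
```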

- Then you apply tanh and get the activations, as in here:
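In the sketch that is just the elementwise non-linearity:

```python
# Elementwise tanh, shape stays (5, 10)
activations = np.tanh(pre_activations)
```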

- Then comes the second layer v_a. Here are the weights:
When you dot-product the activations with this layer’s weights, you get the “alignment scores”:

as in here:
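Continuing the sketch, layer_2 below stands in for $v_a$ (also pre-transposed):

```python
# v_a, pre-transposed: shape (attention_size, 1) = (10, 1)
layer_2 = np.random.randn(attention_size, 1)

# Alignment scores: (5, 10) @ (10, 1) -> (5, 1), one score per encoder state
scores = np.matmul(activations, layer_2)
```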

Just to complete the lab, here are the remaining calculations:
- Then you apply softmax to get the “attention weights” (variable weights in the attention function):
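A plain numpy softmax over the input positions (axis 0) would look like this:

```python
# Softmax along axis 0 -> attention weights, shape (5, 1), summing to 1
weights = np.exp(scores) / np.sum(np.exp(scores), axis=0)
```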

- Then you just multiply (Hadamard product) the encoder states with these “attention weights” to get the weighted encoder states (variable weighted_scores in the attention function):
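In numpy this is a broadcasted elementwise product:

```python
# (5, 16) * (5, 1) -> (5, 16): each encoder state scaled by its attention weight
weighted_scores = encoder_states * weights
```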
- Lastly, you sum these weighted encoder states along axis 0 to get the “context”:
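And the final step of the sketch:

```python
# Sum over the input positions -> context vector, shape (16,)
context = np.sum(weighted_scores, axis=0)
```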
Cheers