C5_W4_A1_Ex-4_EncoderLayer

Dalkhat · May 20, 2021, 11:46am

Hi!
I am having a terrible time with the last programming assignment (probably because I am new to Python, TensorFlow, and Keras). The descriptions and hints are very unhelpful.
The particular problem I have right now: running the test for the EncoderLayer, I get ‘call() missing 1 required positional argument: ‘value’’ on the attn_output = … line. (By the way, I put ‘attention_mask=mask’ after the input by intuition only, which is definitely the wrong way to go, but I have no idea how to handle this.) Please help.

Dyxuki · May 20, 2021, 4:01pm

Hi,
To use the MultiHeadAttention layer, you have to specify query and value, and optionally key.
The documentation (tf.keras.layers.MultiHeadAttention | TensorFlow Core v2.5.0) says: "If query, key, value are the same, then this is self-attention."
In other words, for a given t: q=Wq.x, k=Wk.x, v=Wv.x, all computed from the same input x.

So basically, you have to pass query=x, value=x, and key=x to the call of self.mha (key is optional; if it is missing, then key = value by default).
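
A minimal sketch of the call described above, outside the assignment. The sizes, the all-ones mask, and the variable names are illustrative assumptions, not the notebook's values. It passes the same tensor as query, value, and key (self-attention) and gives the mask as the keyword argument attention_mask.

```python
import tensorflow as tf

# Illustrative sizes only (assumptions); the assignment uses its own values.
batch_size, seq_len, embedding_dim, num_heads = 2, 5, 8, 2

mha = tf.keras.layers.MultiHeadAttention(num_heads=num_heads, key_dim=embedding_dim)

x = tf.random.uniform((batch_size, seq_len, embedding_dim))

# Boolean mask of shape (batch, target_len, source_len); True means "may attend".
mask = tf.ones((batch_size, seq_len, seq_len), dtype=tf.bool)

# Positional order is query, value, key. Calling mha(x, attention_mask=mask)
# omits value, which is what produces the "missing ... 'value'" error.
attn_output = mha(x, x, x, attention_mask=mask)

output_shape = tuple(attn_output.shape)
print(output_shape)
```

By default the output has the same last dimension as the query, so the result here has shape (batch_size, seq_len, embedding_dim).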

Dalkhat · May 20, 2021, 8:17pm

Thank you, Dyxuki! That helped.
Now I am at the full Encoder(), where I get ‘‘ListWrapper’ object is not callable’ on the call to self.enc_layers (I pass x, training, and mask there). Any hints? By the way, could you explain why we have the loop over range(self.num_layers) both in self.enc_layers and in the main code? Thank you!

Dyxuki · May 20, 2021, 9:25pm

Well, the error says it all:
self.enc_layers is a list, so you cannot call it directly; instead, you should call the elements in it.

self.num_layers simply says how many EncoderLayer instances you have.
Namely, you can see “self.enc_layers” as [EncoderLayer, EncoderLayer, EncoderLayer, …, EncoderLayer], where the length is self.num_layers.
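
A small plain-Python sketch of the same point. ToyLayer is a hypothetical stand-in for EncoderLayer (it only adds a constant), so nothing here is the assignment's code. One loop over range(num_layers) builds the list of layers, and a second loop applies them one after another; calling the list itself fails.

```python
class ToyLayer:
    """Hypothetical stand-in for EncoderLayer: adds a constant to its input."""

    def __init__(self, increment):
        self.increment = increment

    def __call__(self, x, training=False, mask=None):
        return x + self.increment


num_layers = 3

# First loop: build one layer object per position in the stack.
enc_layers = [ToyLayer(1) for _ in range(num_layers)]

# Calling the list itself fails, because a list is not callable.
try:
    enc_layers(0)
except TypeError as err:
    error_message = str(err)

# Second loop: feed the output of each layer into the next one.
x = 0
for i in range(num_layers):
    x = enc_layers[i](x, training=False, mask=None)

print(error_message)
print(x)
```

In TensorFlow, a list stored as an attribute of a layer is wrapped in a ListWrapper object, which is presumably why the error message names ‘ListWrapper’ rather than ‘list’.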

Dalkhat · May 21, 2021, 5:44am

Thank you Dyxuki, clear and (now) simple. Now everything works, except I get ‘Wrong values’…fixed… Now I am struggling with ‘Wrong values in outd’ at the Decoder…

Dyxuki · May 21, 2021, 8:51am

You are welcome!
Apparently the wrong outd value problem is a bug in the assignment sheet that has not been fixed yet; I also got it when doing the assignment. The thread is here:
https://community.deeplearning.ai/t/week-4-assignment-1-decoder-class-error

Dalkhat · May 21, 2021, 9:50am

Thanks for the tip. After some struggling, and following that thread, I managed to finish the course!

santoshsastry · May 30, 2021, 1:26am

Hi Dalkhat -
I am running into an issue with the Course 5 Week 4 Exercise 5 Full Encoder (UNQ_C5) step: I am not sure how to add the position encoding to the embedding, or how to resolve the error I get in the log.

Add the position encoding: self.pos_encoding[:, :seq_len, :] to your embedding.

I am passing in x[:, :seq_len, :] to x from the previous step, but this throws an error saying the object is not callable. Can you help point me in the right direction?

Thank you very much.
/santosh

santoshsastry · May 30, 2021, 1:36am

Never mind, I figured out that I should not be passing in the slices for x.

Thanks!
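
A NumPy sketch of the addition that the hint describes, under assumed shapes: pos_encoding is taken to be an array of shape (1, max_len, embedding_dim) and x a batch of embeddings of shape (batch, seq_len, embedding_dim). All values are made up for illustration. Only the positional encoding is sliced and indexed with square brackets; x is used as it is.

```python
import numpy as np

# Assumed, illustrative sizes.
batch_size, seq_len, max_len, embedding_dim = 2, 3, 10, 4

rng = np.random.default_rng(0)
x = rng.normal(size=(batch_size, seq_len, embedding_dim))
pos_encoding = rng.normal(size=(1, max_len, embedding_dim))

# Slice the encoding down to the sequence length; the leading axis of size 1
# broadcasts over the batch. pos_encoding is an array, so it is indexed, not called.
out = x + pos_encoding[:, :seq_len, :]

print(out.shape)
```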

Damon · June 20, 2021, 6:36am

Hi @Dyxuki ,

Given: q=Wq.x, k=Wk.x, v=Wv.x, with the same input x

If we conclude: query=x, value=x, and key=x, then what happens to Wq, Wk, Wv?

Dyxuki · June 20, 2021, 9:31am

Hello,
It’s just that the notation is a little confusing.
The arguments “query”, “value” and “key” should be seen as the input vectors from which q, v and k are respectively computed.
So the argument “query” is not “q” (notation from the course), but rather something like:
q (actual) = Wq.“query (input argument)”

Wq, Wk, Wv are weights of the layer that will be learnt.
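
A single-head NumPy sketch of this distinction, not the Keras implementation: it has no multiple heads and no output projection, and every size and name is an illustrative assumption. The weights W_q, W_k, W_v are created separately from the inputs (here randomly, standing in for learned values), and the arguments query, value, and key are only the inputs they are applied to. The sketch uses the row-vector convention x @ W, which plays the role of Wq.x in the notation above.

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, d_model, d_k = 4, 6, 3  # illustrative sizes

x = rng.normal(size=(seq_len, d_model))

# Layer weights: created independently of whatever is passed in as input.
W_q = rng.normal(size=(d_model, d_k))
W_k = rng.normal(size=(d_model, d_k))
W_v = rng.normal(size=(d_model, d_k))


def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def attention(query, value, key=None):
    if key is None:  # same default as described earlier: key = value
        key = value
    q = query @ W_q  # the actual q is a projection of the "query" argument
    k = key @ W_k
    v = value @ W_v
    weights = softmax(q @ k.T / np.sqrt(d_k))
    return weights @ v, weights


# Self-attention: the same x is passed for all three arguments,
# yet q, k and v differ because W_q, W_k and W_v differ.
out, weights = attention(query=x, value=x)
print(out.shape, weights.shape)
```

Passing query=x, value=x, key=x therefore says nothing about the values of the weights; it only means that all three projections start from the same input.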

Damon · June 23, 2021, 7:06am

Hi @Dyxuki, I tried to view the “query”, “value” and “key” as 3 different aspects of the input x that provide a richer representation of x’s features, without binding them to the notation.

It works well for me.
Thank you for your hint.

LuBinLiu · August 20, 2021, 5:01am

Hi @Dyxuki

I’m also confused about this. How do query, value, and key from the call method relate to the variables from the lecture here, both for the case where all 3 are equal (self-attention) and for the case where they are all different?

Self attention:
[image: capture4]

Multi-head attention:

Regarding the weights, how are Wq, Wk, Wv related to W_i^{<Q>}, W_i^{<K>}, W_i^{<V>} (multi-head attention) and W^{<Q>}, W^{<K>}, W^{<V>} (self attention) from the slide above?

TMosh · August 20, 2021, 5:39am

Q K and V are the Query, Key, and Value.

mc04xkf · September 11, 2021, 8:04am

Since Wq, Wk, Wv are parameters to learn, if we pass query=x, value=x, key=x, does it mean we effectively initialize Wq, Wk, Wv to 1 (instead of initializing them randomly, as we did in other algorithms)?

@Damon @Dyxuki

@GordonRobinson
@Kic
@edwardyu
@laacdm

Bunny · September 12, 2021, 8:03am

In UNQ_C4, how do I apply dropout only during training in 1 line of code? How do I have to use the training variable for this?

Also, I’m getting the error ‘The first argument to Layer.call must always be passed.’ while applying self.layernorm1(tf.math.add(x, atta_output))

TMosh · September 13, 2021, 12:34am

You can do it in a single line of code because the constructor for this class defines a layer, dropout_ffn, that does exactly what you need.
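
A short sketch of how a Keras Dropout layer responds to the training flag, outside the assignment. The name dropout_ffn follows the reply above; the rate of 0.5 and the all-ones input are arbitrary assumptions. With training=False the layer returns its input unchanged, and with training=True it zeroes some entries and scales the rest by 1 / (1 - rate).

```python
import tensorflow as tf

# Arbitrary rate for illustration; the assignment passes its own dropout_rate.
dropout_ffn = tf.keras.layers.Dropout(0.5)

ffn_output = tf.ones((2, 4))

# One line per mode: the training flag decides whether dropout is applied.
inference_out = dropout_ffn(ffn_output, training=False)  # identical to the input
training_out = dropout_ffn(ffn_output, training=True)    # entries are 0.0 or 2.0

print(inference_out.numpy())
print(training_out.numpy())
```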
