I don’t understand why the model is called with a tuple. In previous assignments this was usually done when tl.Parallel is used, but in this model’s architecture I don’t see why a tuple is necessary. Maybe I am missing something basic. Can someone please help me understand this?
I believe (although I could not trace exactly where it starts) it is because there are many functions linked to the final model: the tokenized input is passed along as padded tensors from the beginning of the lab all the way to the creation of the TransformerLM.
Sorry, I did not understand. Why is it necessary to provide the same input twice to the model, as a tuple?
It should be because of the QKV multiplication: one copy is masked and the other is not, so two copies of the input are passed to the model.
I don’t think you are correct on this one @gent.spah - the elements in the tuple are identical and have nothing to do with QKV (unless I don’t understand something?).
And in general, @Ritu_Pande, it is a good question why the instructions ask for a tuple as input, because the model would work just fine with:
output = model(padded_with_batch)
Maybe it is a remnant of some code that used the second output as a target… I don’t know.
In any case, during inference (the next_symbol function) the model never “touches” the second input: it could be changed to np.zeros_like(padded_with_batch) and the output of the model would not change (the _ variable would simply be zeros).
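To make the pass-through behavior concrete, here is a minimal sketch in plain NumPy. This is a toy stand-in, not the real Trax TransformerLM: it only mimics the calling convention under discussion, where the model takes a tuple, computes its output from the first element alone, and passes the second element through unchanged. The function name and the fake log-probabilities are invented for illustration.

```python
import numpy as np

def toy_transformer_lm(inputs):
    """Toy stand-in (NOT the real Trax TransformerLM): takes a tuple
    (tokens, passthrough), computes fake log-probabilities from the
    first element only, and returns the second element untouched."""
    tokens, passthrough = inputs
    vocab_size = 8
    # Fake "log-probabilities": a one-hot lookup that depends only on `tokens`.
    logits = np.eye(vocab_size)[tokens % vocab_size]
    return logits, passthrough

padded_with_batch = np.array([[5, 3, 7, 0]])

# Call with the input duplicated, as the assignment instructs.
out1, _ = toy_transformer_lm((padded_with_batch, padded_with_batch))

# Replace the second element with zeros: the first output is unchanged,
# which is the point made above about next_symbol during inference.
out2, second = toy_transformer_lm((padded_with_batch,
                                   np.zeros_like(padded_with_batch)))

assert np.array_equal(out1, out2)   # second input never affects the logits
assert np.array_equal(second, np.zeros_like(padded_with_batch))
```

The same check can be run against the real model in the lab: swap the second tuple element for zeros and compare the first output before and after.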
I will report it for further investigation.
Sometimes I am wrong too, @arvyzukai, and that might be the case here; I was not fully convinced, which is why I said “I believe”. If anybody finds the right answer, let us know!
Of course, @gent.spah, respect for admitting you might be wrong. Everyone is wrong at some point, and I might be wrong here too; that is why I asked for further comment.
No worries @arvyzukai, it’s good to be wrong sometimes too.
Yep, @gent.spah, and the best outcome of this thread would be if we’re both wrong. That would mean we can learn something and correct our wrong model (no pun intended) of the world.