UNQ_C9: Model input

The hint in the next_symbol() says:

model expects a tuple containing two padded tensors (with batch)

  1. I don’t understand: what parameters or properties of the model tell us that the model expects a tuple containing two padded tensors with batch? Can someone please explain? In general, beyond the model structure, is there a way to investigate and debug the inner workings of the model? I wish the course spent some time on that.

  2. Also, is my assumption correct that the symbols corresponding to the output sequence preceding the next symbol should mirror the input sequence?

Hi @arosenbloom

That is a good question :+1: Generally, trax training expects a tuple of (input, output) or (input, output, weights). The details are a lot to explain in a single post. Unfortunately the trax TrainTask documentation is not great on details; the best way is to look inside the code, which can be liberating or challenging depending on one's programming knowledge.
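To make the "tuple of two padded tensors (with batch)" shape concrete, here is a minimal numpy sketch. The `pad_batch` helper and the token ids are hypothetical, not from the assignment; the point is only that the model is called with a tuple of two 2-D arrays of shape (batch, seq_len), padded (here with 0, a common but assumed pad id) to a common length.

```python
import numpy as np

def pad_batch(sequences, pad_value=0):
    """Pad variable-length token-id lists into one (batch, max_len) array.

    This approximates what a trax data pipeline produces; pad_value=0
    is an assumption for illustration."""
    max_len = max(len(s) for s in sequences)
    batch = np.full((len(sequences), max_len), pad_value, dtype=np.int64)
    for i, seq in enumerate(sequences):
        batch[i, :len(seq)] = seq
    return batch

# Hypothetical token ids for two examples.
inputs  = [[5, 6, 7], [8, 9]]
targets = [[5, 6, 7], [8, 9]]

# The model is fed a tuple of two padded, batched tensors.
model_input = (pad_batch(inputs), pad_batch(targets))
print(model_input[0].shape)  # (2, 3)
```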

You can debug notebooks by adding %%debug at the top of the cell, but here too you need a bit of knowledge. Mostly you use n to step to the next line, s to step into a function (for example, trax uses the .forward() and pure_fn() methods, which is where you would want to "step in"), and exit to leave the debugger.

The model outputs probabilities for the whole sequence (the symbols up to the "next_symbol", the "next_symbol" itself, and the symbols after it), and it should assign the highest, or at least a very high, probability to the symbols "up to the next_symbol". In other words, the loss is calculated over the whole sequence, and ideally the probabilities are highest (or very high) wherever the targets are.
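A small numpy sketch of that idea: the model produces one distribution over the vocabulary per position, and the loss gathers the log-probability of the target symbol at every position of the sequence, not just the last one. The logits and target ids below are random toy values, not the assignment's model output.

```python
import numpy as np

def log_softmax(logits):
    # Numerically stable log-softmax over the last (vocab) axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))

rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 5))       # 4 positions, vocab of 5 (toy)
targets = np.array([1, 3, 0, 2])       # hypothetical target ids

log_probs = log_softmax(logits)        # (seq_len, vocab)

# The loss covers the WHOLE sequence: pick each target's log-prob at
# its own position, then average the negatives (cross-entropy).
per_position = log_probs[np.arange(4), targets]
loss = -per_position.mean()
```

Training pushes `loss` down, i.e. pushes the probability at every target position up, which is why the trained model ends up assigning high probability where the targets are.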



Hi, @arvyzukai

Thanks for sharing the tips about using %%debug. I strongly recommend adding a short video demo on its use at the beginning of the course.

Secondly, given this:

… why does the hint in UNQ_C9 suggest that to obtain the log probability for the next symbol, we should look for it at the token_length position of the log_probs array? Shouldn’t it be at least at token_length+1 (but probably further away given the EOS and the SEP characters)?


Hi @arosenbloom

I’m inclined to think that, since this is the Natural Language Processing Specialization, it should not cover too many extra topics, including this one. But I would agree with you that most learners using notebooks should be familiar with magic commands and also have some basic debugging knowledge (especially what to do after receiving an error).

It is because of the way indexing works: the token_length value is exactly the index we want. For example, if we have generated “I love learning”, then token_length would be 3 (3 words), and the index for the next_token is also 3. In other words, the value at index 3 of [7, 8, 9, 10] is 10.


Silly of me! Of course! Thanks!

Thanks for discussing this problem! I also recently had this problem when determining the parameters for the input. I believe that during training the model actually takes three parameters: input, output, and mask, as previously defined in the stream generator. Why is the mask neglected at the later stage?

Hi @David_C1

It’s been a while since I looked deeper into the code, but I don’t think training.Loop neglects the training mask during training in this case.
If I remember correctly, the training model in this assignment takes two identical input streams of data (both containing (input, output, mask)).
Digging deep into the training.Loop code is a bit of a challenge in trax :slight_smile: and you helped me remember that someone asked a similar question and I had no time to look into it thoroughly. So if you’re doing a deep dive into this, please share your conclusions :slight_smile:
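For intuition on what such a stream looks like, here is a hypothetical generator yielding (input, target, mask) triples, with the mask 1 on real tokens and 0 on padding so that padded positions carry no loss weight. The function name, token ids, and pad id 0 are all assumptions for illustration, not the assignment's actual generator.

```python
import numpy as np

def train_stream():
    """Hypothetical stream of (input, target, mask) triples.

    The mask marks real tokens with 1 and padding with 0, so the
    training loop can use it as per-position loss weights."""
    while True:
        inp = np.array([[5, 6, 7, 0]])      # one padded example (toy ids)
        tgt = np.array([[5, 6, 7, 0]])
        mask = (inp != 0).astype(np.int64)  # 1 on tokens, 0 on padding
        yield (inp, tgt, mask)

batch = next(train_stream())
# A training loop would consume the triple and apply the mask as
# weights, so padded positions don't contribute to the loss.
```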