token_list = tokenizer.texts_to_sequences([line])[0]

Hi everyone,

Can someone help me by explaining what that [0] at the end is doing?
I don’t understand it; maybe I missed something in the lecture.
Thank you in advance!

Hey Pham!

If you follow along with the notebook, you might see this line:
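tokenizer.texts_to_sequences([corpus[0]])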

Notice that you received the sequence wrapped inside a list, so in order to get only the desired sequence you need to explicitly take the first item of that list, like this:
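tokenizer.texts_to_sequences([corpus[0]])[0]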

In other words: because tokenizer.texts_to_sequences([corpus[0]]) returns the sequence wrapped in a list (something like [[1, 2, 3, 4, 5]]), we need to extract the sequence itself, which is just the 0th element of that list!
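If it helps, here is a minimal, self-contained sketch of that behavior (the corpus below is just an illustrative example, not the lab's data):

from tensorflow.keras.preprocessing.text import Tokenizer

corpus = ["deep learning is really fun"]  # tiny hypothetical corpus

tokenizer = Tokenizer()
tokenizer.fit_on_texts(corpus)

print(tokenizer.texts_to_sequences([corpus[0]]))     # [[1, 2, 3, 4, 5]] -> the sequence wrapped in a list
print(tokenizer.texts_to_sequences([corpus[0]])[0])  # [1, 2, 3, 4, 5]   -> just the sequence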

Hopefully this explanation is clear! Please reach out if you have any further questions! :smile:


Oh, sorry, I should have read it more carefully.
Thank you for your explanation, it is super clear.


Can you please clarify what you meant? In Lab 2 the code snippet in question is written as tokenizer.texts_to_sequences([line])[0], not, as you put it, tokenizer.texts_to_sequences([corpus[0]]). According to the TensorFlow API docs, the tokenizer.texts_to_sequences() method returns a list of sequences. My understanding of the docs was that it should return a list of lists of tokenized sentences. However, when I printed out what it returns, I got the following (see the sketch after this list):

  • when the first element is indexed (i.e. extracted with [0]): a bunch of tokenized sentences, each packaged in its own list, or, in other words, 1D vectors of shape (n,);

  • when not indexed: a bunch of tokenized sentences, each packaged in its own 2D vector of shape (1, n).
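If those prints were made inside the loop that tokenizes the corpus line by line (my assumption, based on where the snippet sits), this sketch reproduces both outputs:

for line in corpus:
    print(tokenizer.texts_to_sequences([line]))     # [[...]] -> one sequence still wrapped in a list, shape (1, n)
    print(tokenizer.texts_to_sequences([line])[0])  # [...]   -> the same sequence unwrapped, shape (n,)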

Also, it is not clear why we need to tokenize and extract only one line at a time with:
# Tokenize the current line
token_list = tokenizer.texts_to_sequences([line])[0]
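For context, here is my understanding of the surrounding loop (variable names follow the lab; the n-gram part is my guess at why each line is handled separately):

input_sequences = []
for line in corpus:
    # texts_to_sequences expects a list of texts, so the single line is wrapped in a list,
    # and the trailing [0] unwraps the single sequence that comes back
    token_list = tokenizer.texts_to_sequences([line])[0]
    # presumably each line's sequence is then split into n-gram subsequences
    for i in range(1, len(token_list)):
        input_sequences.append(token_list[:i + 1])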