token_list = tokenizer.texts_to_sequences([line])[0]

Hi everyone,

Can someone help me by explaining what that [0] at the end is doing?
I don’t understand it; maybe I missed something in the lecture.
Thank you in advance!

Hey Pham!

If you follow along with the notebook, you might see this line:
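tokenizer.texts_to_sequences([corpus[0]])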

Notice that you received the sequence wrapped inside a list, so in order to get only the desired sequence you need to explicitly take the first item of that list, like this:
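tokenizer.texts_to_sequences([corpus[0]])[0]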

In other words: because tokenizer.texts_to_sequences([corpus[0]]) returns the sequence wrapped in a list (something like [[1, 2, 3, 4, 5]]), we need to extract the sequence itself, which is just the 0th element of that list!
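If it helps, here is a minimal, self-contained sketch of that behavior (the corpus below is just an illustrative example, not the lab's data):

from tensorflow.keras.preprocessing.text import Tokenizer

corpus = ["deep learning is really fun"]  # tiny hypothetical corpus

tokenizer = Tokenizer()
tokenizer.fit_on_texts(corpus)

print(tokenizer.texts_to_sequences([corpus[0]]))     # [[1, 2, 3, 4, 5]] -> the sequence wrapped in a list
print(tokenizer.texts_to_sequences([corpus[0]])[0])  # [1, 2, 3, 4, 5]   -> just the sequence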

Hopefully this explanation is clear! Please reach out if you have any further questions! :smile:


Oh, sorry, I should have read it more carefully.
Thank you for your explanation, it is super clear.


Can you please clarify what you meant? In Lab 2 the code snippet in question is written as tokenizer.texts_to_sequences([line])[0], not, as you put it, tokenizer.texts_to_sequences([corpus[0]]). According to the TensorFlow API docs, the tokenizer.texts_to_sequences() method returns a list of sequences. My understanding of the docs was that it should return a list of lists of tokenized sentences. However, when I printed out what it returns, I got the following (see the sketch after this list):

  • when the first element is indexed (i.e. extracted with [0]): a bunch of tokenized sentences, each packaged in its own list, or, in other words, 1D vectors of shape (n,);

  • when not indexed: a bunch of tokenized sentences, each packaged in its own 2D vector of shape (1, n).
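If those prints were made inside the loop that tokenizes the corpus line by line (my assumption, based on where the snippet sits), this sketch reproduces both outputs:

for line in corpus:
    print(tokenizer.texts_to_sequences([line]))     # [[...]] -> one sequence still wrapped in a list, shape (1, n)
    print(tokenizer.texts_to_sequences([line])[0])  # [...]   -> the same sequence unwrapped, shape (n,)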

Also, it is not clear why we need to tokenize and extract only one line at a time with:
# Tokenize the current line
token_list = tokenizer.texts_to_sequences([line])[0]
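For context, here is my understanding of the surrounding loop (variable names follow the lab; the n-gram part is my guess at why each line is handled separately):

input_sequences = []
for line in corpus:
    # texts_to_sequences expects a list of texts, so the single line is wrapped in a list,
    # and the trailing [0] unwraps the single sequence that comes back
    token_list = tokenizer.texts_to_sequences([line])[0]
    # presumably each line's sequence is then split into n-gram subsequences
    for i in range(1, len(token_list)):
        input_sequences.append(token_list[:i + 1])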