04_Data_preparation_lab_student

In the notebook for 04_Data_preparation_lab_student we have the following code:

def tokenize_function(examples):
    ...

    # the tokenizer has no dedicated pad token, so reuse the EOS token
    tokenizer.pad_token = tokenizer.eos_token

    # first pass: pad the batch to its longest sequence, so that
    # shape[1] tells us the longest length in the batch
    tokenized_inputs = tokenizer(
        text,
        return_tensors="np",
        padding=True,
    )

    # cap the length at 2048 tokens
    max_length = min(
        tokenized_inputs["input_ids"].shape[1],
        2048
    )

    # second pass: re-tokenize, truncating from the left to that length
    tokenizer.truncation_side = "left"
    tokenized_inputs = tokenizer(
        text,
        return_tensors="np",
        truncation=True,
        max_length=max_length
    )

    return tokenized_inputs
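To make the two-pass logic concrete, here is a minimal sketch using a hypothetical whitespace "tokenizer" stand-in (`toy_tokenize` is an invented helper, not the Hugging Face API; it only mimics the shapes involved). The first pass pads the batch to its longest sequence, and the second pass left-truncates to `min(longest, 2048)`:

```python
import numpy as np

# Hypothetical stand-in for the Hugging Face tokenizer: splits on whitespace,
# uses word lengths as fake token ids, and pads with 0 to the batch longest.
def toy_tokenize(batch, max_length=None):
    ids = [[len(w) for w in s.split()] for s in batch]
    if max_length is not None:
        ids = [seq[-max_length:] for seq in ids]       # truncation_side="left"
    longest = max(len(seq) for seq in ids)
    return np.array([seq + [0] * (longest - len(seq)) for seq in ids])

short_batch = ["one two three four five", "one two"]   # longest: 5 "tokens"
long_batch = [" ".join(["w"] * 3000)]                  # 3000 "tokens"

for batch in (short_batch, long_batch):
    # pass 1: pad to the batch longest; shape[1] is that longest length
    first = toy_tokenize(batch)
    max_length = min(first.shape[1], 2048)
    # pass 2: left-truncate to min(batch longest, 2048)
    second = toy_tokenize(batch, max_length=max_length)
    print(first.shape[1], "->", second.shape[1])       # 5 -> 5, then 3000 -> 2048
```

The short batch keeps its natural length (5), while the long batch is cut down to the 2048 cap, which is exactly what the two tokenizer calls in the notebook compute.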

Why is the tokenizer being called twice? Why can we not use the following code instead, with a single call to the tokenizer?

    tokenizer.pad_token = tokenizer.eos_token
    tokenizer.truncation_side = "left"
    max_length = 2048
    # inputs with more than 2048 tokens will be truncated on the left to 2048 tokens
    # inputs with fewer than 2048 tokens will remain unchanged
    # even padding=True is not needed, if the function is not batched when mapped over the dataset
    tokenized_inputs = tokenizer(
        text,
        return_tensors="np",
        truncation=True,
        max_length=max_length
    )
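The claims in the comments can be checked with the same kind of toy stand-in, applied to one example at a time (`toy_tokenize_one` is an invented whitespace "tokenizer", not the real API; this is a sketch assuming the function runs per example, with no padding involved):

```python
import numpy as np

# Hypothetical per-example stand-in tokenizer: one fake token id per word,
# left-truncated to max_length if the sequence is too long.
def toy_tokenize_one(text, max_length=None):
    ids = [len(w) for w in text.split()]
    if max_length is not None:
        ids = ids[-max_length:]                        # truncation_side="left"
    return np.array(ids)

short = "one two three"                                # 3 "tokens" < 2048
long = " ".join(["w"] * 3000)                          # 3000 "tokens" > 2048

print(toy_tokenize_one(short, max_length=2048).shape)  # (3,): unchanged
print(toy_tokenize_one(long, max_length=2048).shape)   # (2048,): left-truncated
```

Under these assumptions the single call produces the same lengths as the two-pass version: short inputs are untouched and long inputs are capped at 2048, which is what the question is getting at.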