Flan-T5: maximum input and output length?

Ali_issa · August 18, 2023, 1:14pm

Hello, could someone provide more information regarding the maximum input and output size of the Flan-T5 models? While reading the paper, I noticed it was trained on 1024 input length and 256 output length, but I also saw conflicting information. Can someone please clarify? Thank you.

carloshvp · September 8, 2023, 4:41pm

Hi @Ali_issa , welcome to the community!

Token limits on the input and output are inherent to each LLM, so I would assume the answer depends on the model you choose. As you can see on Hugging Face, there are several versions of Flan-T5, which may have different input/output token limits.
Does that make sense?

Ali_issa · September 8, 2023, 4:56pm

Hello @carloshvp, thank you for your answer. I was doing some research on Flan-T5-large, and I am not sure about its specific input and output lengths. If you know how to obtain them, or if you are already familiar with the input/output size of Flan-T5-large, feel free to share the numbers with me.

carloshvp · September 9, 2023, 3:12pm

Hi @Ali_issa ,

As you know, FLAN-T5 is an instruction fine-tuned variant of T5. Looking at the paper behind T5 (here), it looks like they used a maximum sequence length of 512 tokens, which means anything beyond that will probably give bad results. There is, however, some literature about extending the context size, but that is another topic.

An easier way to find the window size is probably to look at the d_model parameter in the T5 model documentation at Hugging Face, where you can also see the same 512 value.
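
For anyone who wants to stay within the 512-token length mentioned above, here is a minimal sketch of truncating a tokenized input. It is dependency-free so it runs anywhere; `truncate_ids` is a hypothetical helper, the ids are made up, and the end-of-sequence id of 1 is an assumption based on the usual T5 vocabulary, so check it against your tokenizer.

```python
def truncate_ids(token_ids, max_length=512, eos_id=1):
    """Cut a list of token ids to at most `max_length`, ending with EOS.

    `eos_id=1` is assumed here (the usual T5 end-of-sequence id);
    verify it with your own tokenizer before relying on it.
    """
    if max_length < 1:
        raise ValueError("max_length must be at least 1")
    if len(token_ids) <= max_length:
        return list(token_ids)
    # Keep the first max_length - 1 ids and re-append the EOS marker.
    return list(token_ids[: max_length - 1]) + [eos_id]


# 600 made-up token ids followed by EOS, standing in for a long prompt.
long_input = list(range(2, 602)) + [1]
short_input = truncate_ids(long_input)
print(len(long_input), "->", len(short_input))
```

In practice the Hugging Face tokenizers can do the same thing for you if you pass `truncation=True` and `max_length=512` when encoding the text.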

carloshvp · September 9, 2023, 3:16pm

I am not sure, however, that d_model is the right parameter to look at; as far as I understand, it is the model's hidden size rather than a sequence length. Previous versions of the Hugging Face documentation had a very useful and clear parameter called n_positions, which is exactly what you are searching for (and it is also 512).
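
A small sketch of how one could read both fields side by side and avoid mixing them up. `summarize_lengths` is a hypothetical helper, and the stand-in config below uses illustrative values so the snippet runs offline; the real numbers should be read from the model's own config, as shown in the comment.

```python
from types import SimpleNamespace


def summarize_lengths(config):
    """Return the size-related fields of a T5-style config.

    `config` can be any object exposing `d_model` and `n_positions`
    attributes, for example a `transformers.T5Config`.
    """
    return {
        # Width of the hidden (embedding) vectors; not a sequence length.
        "hidden_size": getattr(config, "d_model", None),
        # Sequence length, in tokens, that the config declares.
        "max_positions": getattr(config, "n_positions", None),
    }


# Stand-in config with illustrative values (assumptions, not verified here).
# To inspect the real model, replace it with:
#   from transformers import AutoConfig
#   config = AutoConfig.from_pretrained("google/flan-t5-large")
example_config = SimpleNamespace(d_model=1024, n_positions=512)

limits = summarize_lengths(example_config)
print(limits)
```

If the two numbers differ for the checkpoint you load, the one that relates to input length is `n_positions`, not `d_model`.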

carloshvp · September 9, 2023, 3:26pm

If you are happy with the reply, please mark it as the solution to your question; that would help me.
