Why is Flan T5 used as the base model in Lab 1?

I’m curious why this model was chosen, and whether folks are still using it as a base model for their own work.

FLAN T5 is a Google model distributed through Hugging Face. As the instructors are from AWS and AWS recently partnered with Hugging Face, they probably preferred to use this LLM. I believe AWS partnered with Hugging Face as a countermove, since Microsoft made a big leap in AI after partnering with OpenAI.

You can find the answer in the Week 2 video → multi-task-instruction-fine-tuning


Thank you for responding. I'm looking forward to seeing the answer in Week 2.

I wanted to ask the same thing, thanks for asking!

Flan T5 lets you experiment with and learn about LLMs without incurring any costs. It is also very capable and can handle a wide variety of tasks.
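
To make the "wide variety of tasks" point concrete, here is a rough sketch of how one instruction-tuned model can be steered to different tasks just by changing the prompt. The template wording, the helper names `build_prompt` and `run_flan_t5`, and the `google/flan-t5-small` checkpoint are my own assumptions for illustration, not something taken from the lab. The generation part assumes the Hugging Face `transformers` library and PyTorch are installed.

```python
# Hypothetical instruction templates; the wording is illustrative only.
TEMPLATES = {
    "summarize": "Summarize the following conversation.\n\n{text}\n\nSummary:",
    "translate": "Translate English to German: {text}",
    "sentiment": "Is the sentiment of this review positive or negative?\n\n{text}",
}


def build_prompt(task: str, text: str) -> str:
    """Wrap raw text in the instruction for the requested task."""
    if task not in TEMPLATES:
        raise ValueError(f"unknown task: {task!r}")
    return TEMPLATES[task].format(text=text)


def run_flan_t5(prompt: str,
                checkpoint: str = "google/flan-t5-small",  # assumed checkpoint
                max_new_tokens: int = 50) -> str:
    """Generate a completion for one prompt (downloads the model on first use)."""
    # Imported lazily so build_prompt works even without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `run_flan_t5(build_prompt("translate", "How are you?"))` should return a German translation attempt, and swapping the task name changes the behavior without touching the model. The small checkpoint runs on a CPU, which is part of why it costs nothing to play with.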