I realize that Flan-T5 comes in several different sizes, but at the start of Lab 1 it was mentioned that the Flan-T5 base model was being used. Can anyone confirm whether it is indeed the 250M parameter base model?
Unfortunately, I could not find where the lab states which Flan-T5 model is being used. Perhaps I just missed it…
Thanks!