A few doubts regarding the pre-training and working of T5 transformers

Karan_L · November 6, 2023, 4:24pm

Hi, I have a few questions about the pre-training and working of the T5 transformer.

In the masked language model approach to pre-training, what are the inputs to the encoder and the decoder of the model? How is the loss calculated?

Younes said that the input to the model is the original text with a few tokens replaced by special mask tokens, and that the target is the masked tokens delimited by those special tokens. However, I did not understand the overall flow of pre-training for T5. Could someone elaborate a bit more so that I can get a better intuition for it?

How does the T5 model handle different types of tasks? How was pre-training on supervised tasks like regression done in the text-to-text format?

arvyzukai · November 8, 2023, 7:43am

Hi @Karan_L

I think this answers your questions. Feel free to ask if it’s still confusing.
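
For a rough picture of the flow, here is a minimal sketch in plain Python, assuming the span-corruption setup described in the T5 paper. The helper names (`span_corrupt`, `shift_right`, `stsb_target`) are made up for illustration and are not part of any library. Real pre-training picks the masked spans at random, while this sketch takes them as explicit arguments, and it leaves out the end-of-sequence token for brevity.

```python
def span_corrupt(tokens, spans):
    """Build (encoder_input, target) from sorted, non-overlapping
    (start, length) spans. Hypothetical helper for illustration only."""
    inputs, targets = [], []
    cursor = 0
    for i, (start, length) in enumerate(spans):
        sentinel = f"<extra_id_{i}>"
        # Encoder input: keep the unmasked text, replace each span by one sentinel.
        inputs.extend(tokens[cursor:start])
        inputs.append(sentinel)
        # Target: the same sentinel followed by the tokens it replaced.
        targets.append(sentinel)
        targets.extend(tokens[start:start + length])
        cursor = start + length
    inputs.extend(tokens[cursor:])
    # A final sentinel closes the target sequence.
    targets.append(f"<extra_id_{len(spans)}>")
    return inputs, targets


def shift_right(targets, start_token="<pad>"):
    """Decoder input under teacher forcing: the target shifted by one position.
    T5 uses the pad token as the decoder start token."""
    return [start_token] + targets[:-1]


def stsb_target(score):
    """Regression as text: round the score to the nearest 0.2 and emit a string,
    as the T5 paper describes for STS-B."""
    return f"{round(score * 5) / 5:.1f}"


tokens = "Thank you for inviting me to your party last week".split()
encoder_input, target = span_corrupt(tokens, [(2, 2), (8, 1)])
decoder_input = shift_right(target)

print(" ".join(encoder_input))  # Thank you <extra_id_0> me to your party <extra_id_1> week
print(" ".join(target))         # <extra_id_0> for inviting <extra_id_1> last <extra_id_2>
print(" ".join(decoder_input))  # <pad> <extra_id_0> for inviting <extra_id_1> last
print(stsb_target(3.74))        # 3.8
```

The encoder sees the corrupted text, the decoder is fed the shifted target, and the loss is the usual cross-entropy between the decoder's predicted token at each position and the corresponding token in `target`. Supervised tasks use the same recipe: the input is text with a task prefix, and the label (even a number) is written out as a string for the decoder to generate.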

Cheers

Karan_L · November 9, 2023, 9:01am

Thanks @arvyzukai.
