What does seq2seq mean in Transformer?

There’s something I don’t quite understand yet.
I thought seq2seq was a pre-transformer model, so why does the lecture refer to the Transformer encoder-decoder model as seq2seq? Could someone explain?

Here is a nice article about seq2seq models:

Hello @Goomin

Your query comes from the GenAI with LLMs course, but it seems more relevant to the NLP Specialization, which explains that sequence-to-sequence models are trained to convert a sequence of input data (such as a sentence in one language) into a sequence of output data (such as a sentence in another language).

The architecture of a seq2seq model typically consists of two main parts: an encoder and a decoder.

The encoder takes the input sequence and converts it into a fixed-length vector representation, often referred to as the context vector or hidden state. The decoder then takes this context vector and generates the output sequence, step by step.
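Here is a minimal sketch (not from the course; the vocabulary sizes, dimensions, and token ids are made-up examples) of the classic RNN-based seq2seq pattern: the encoder compresses the input sequence into a context vector, and the decoder generates the output sequence one token at a time.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, vocab_size, emb_dim, hid_dim):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hid_dim, batch_first=True)

    def forward(self, src):                      # src: (batch, src_len)
        _, hidden = self.rnn(self.embed(src))    # hidden: (1, batch, hid_dim)
        return hidden                            # the fixed-length context vector

class Decoder(nn.Module):
    def __init__(self, vocab_size, emb_dim, hid_dim):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, prev_token, hidden):       # prev_token: (batch, 1)
        output, hidden = self.rnn(self.embed(prev_token), hidden)
        return self.out(output), hidden          # logits over the target vocab

# Greedy decoding, one step at a time, starting from a <sos> token (id 1 here).
encoder, decoder = Encoder(100, 32, 64), Decoder(100, 32, 64)
src = torch.randint(0, 100, (1, 7))              # a dummy source sentence
hidden = encoder(src)
token = torch.tensor([[1]])                      # <sos>
for _ in range(5):
    logits, hidden = decoder(token, hidden)
    token = logits.argmax(-1)                    # pick the most likely next token
```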

Sequence-to-sequence models were classically built from Recurrent Neural Network architectures and are typically used (but not limited) to solve complex language problems like machine translation, question answering, chatbots, and text summarization. The Transformer encoder-decoder follows the same sequence-to-sequence pattern, replacing recurrence with attention, which is why the lecture still refers to it as a seq2seq model.

Whereas in the pre-transformer era, Word2Vec and GloVe were the two main methods for producing dense embeddings. These methods have a one-to-one mapping between a word and its embedding representation.
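A tiny illustration of that one-to-one mapping (the vectors below are made up, not real Word2Vec or GloVe values): every occurrence of a word gets the same vector, regardless of its context.

```python
# Static embeddings: one fixed vector per word, independent of context.
static_embeddings = {
    "bank":  [0.21, -0.53, 0.88],   # same vector for "river bank" and "bank loan"
    "river": [0.10,  0.47, -0.32],
    "loan":  [-0.61, 0.05, 0.72],
}

sentence = ["river", "bank", "loan", "bank"]
vectors = [static_embeddings[w] for w in sentence]
assert vectors[1] == vectors[3]     # both "bank" tokens share one embedding
```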

Feel free to ask if you have more doubts!

Regards
DP
