Is this true: "Encoder-only models are also known as autoencoding models"?

myusername · July 18, 2023, 5:48pm

The lecturer said: “Encoder-only models are also known as autoencoding models.”

But, Wikipedia says that “An autoencoder learns two functions: an encoding function that transforms the input data, and a decoding function”

So, does an autoencoder have a decoding function? The video and Wikipedia seem to be saying opposite things. I’m very confused.

Video can be found in Week 1 “Pre-training large language models” @ time 3:43.

Juan_Olano · July 18, 2023, 9:30pm

Valid point. Encoder-only models are exactly that: Encoder. They don’t contain the ‘decoder’ part of the transformer.

Autoencoders do have an encoder but, as you point out, they also have a decoding function.

myusername · July 18, 2023, 10:11pm

Thanks, Juan. Hopefully, that part of the video will be updated in the next iteration of the course.

Juan_Olano · July 18, 2023, 11:23pm

I’ve sent a signal to the monitors - let’s wait and see how this evolves. Thanks!

rmwkwok · July 18, 2023, 11:26pm

Hello @myusername,

I think we can’t mix up the two sets of terminology here.

Wikipedia’s autoencoder does indeed have an encoder and a decoder, and a perfect decoder is expected to reproduce the input sentence.

Source: that Wikipedia page.

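A transformer encoder, on the other hand, is expected to reconstruct the input sentence on its own: in BERT’s pre-training, some input tokens are masked and the encoder (with a small output layer on top) predicts them, without any separate transformer decoder.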

Source: the BERT paper.
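
To make “the encoder alone reproduces the input” a bit more concrete, here is a minimal sketch of how a masked-token training pair can be built. The function name `mask_tokens`, the example sentence, and the masking rate are my own illustrative assumptions; BERT’s actual recipe masks about 15% of tokens and also sometimes swaps in a random token or keeps the original, which this sketch leaves out.

```python
import random

MASK = "[MASK]"  # placeholder token; the spelling follows BERT's convention


def mask_tokens(tokens, mask_prob=0.15, seed=0):
    """Return (corrupted, targets) for a masked-token objective.

    corrupted is what the encoder sees; targets[i] holds the original
    token wherever position i was masked, and None everywhere else.
    """
    rng = random.Random(seed)
    corrupted, targets = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            corrupted.append(MASK)   # hide the token from the model
            targets.append(tok)      # the model must reconstruct it
        else:
            corrupted.append(tok)
            targets.append(None)     # no loss at unmasked positions
    return corrupted, targets


# Hypothetical example sentence, with a higher rate so masks are likely.
sentence = "the teacher teaches the student".split()
corrupted, targets = mask_tokens(sentence, mask_prob=0.3)
print(corrupted)
print(targets)
```

The model is trained to map `corrupted` back to the original tokens at the masked positions, which is the sense in which such a model “auto-encodes” its input even though there is no separate decoder block.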

They just use the same English word “encoder”, but it has a different meaning in each context. You may say that the transformer encoder is called the “transformer encoder” because there is the option of a “transformer decoder” downstream, and we need to cope with the naming conventions.

Cheers,
Raymond

tharunnayak14 · July 27, 2023, 2:18pm

Encoder-only models and autoencoders are different concepts.
Encoder-only models do not have a decoder and produce contextual representations of the input.
Autoencoders consist of both an encoder and a decoder.
Encoder-only models are used when only the encoding part is needed, while autoencoders are used for unsupervised learning tasks to compress and reconstruct data.

Christian_Simonis · July 27, 2023, 5:59pm

This is how autoencoders are trained: with both an encoder and a decoder.

However, depending on the use case, autoencoders can also be used with only the encoder at inference time, e.g. for dimensionality reduction ahead of downstream tasks. Here the trained encoder takes care of feature extraction into a low-dimensional space, i.e. efficient compression of high-dimensional data. After dimensionality reduction, these data are usually processed further, e.g. with clustering, classification or visualization, using the efficient latent-space representation.
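
As a rough illustration of that workflow, here is a toy linear autoencoder in plain Python: encoder and decoder are trained together on reconstruction error, and afterwards only the encoder is used to produce low-dimensional codes. The data, weight initialisation, learning rate, and all function names are assumptions chosen for the sketch, not a recommended setup.

```python
# Toy linear autoencoder: 2-D input -> 1-D code -> 2-D reconstruction.
# All names and hyperparameters here are illustrative assumptions.

def encode(w, x):
    """Encoder: project a 2-D point onto a single latent number."""
    return w[0] * x[0] + w[1] * x[1]


def decode(v, z):
    """Decoder: map the latent number back to a 2-D point."""
    return [v[0] * z, v[1] * z]


def reconstruction_loss(w, v, data):
    """Mean squared reconstruction error over the dataset."""
    total = 0.0
    for x in data:
        x_hat = decode(v, encode(w, x))
        total += (x_hat[0] - x[0]) ** 2 + (x_hat[1] - x[1]) ** 2
    return total / len(data)


def train(data, steps=500, lr=0.05):
    """Train encoder and decoder jointly with plain gradient descent."""
    w, v = [0.5, 0.5], [0.5, 0.5]
    n = len(data)
    for _ in range(steps):
        gw, gv = [0.0, 0.0], [0.0, 0.0]
        for x in data:
            z = encode(w, x)
            x_hat = decode(v, z)
            # Gradient of the squared error w.r.t. the reconstruction.
            err = [2 * (x_hat[0] - x[0]), 2 * (x_hat[1] - x[1])]
            # Gradient w.r.t. the latent code, via the decoder weights.
            dz = err[0] * v[0] + err[1] * v[1]
            for i in range(2):
                gv[i] += err[i] * z / n
                gw[i] += dz * x[i] / n
        w = [w[i] - lr * gw[i] for i in range(2)]
        v = [v[i] - lr * gv[i] for i in range(2)]
    return w, v


# Points on the line y = 2x, so one latent dimension is enough.
data = [[t, 2 * t] for t in (-1.0, -0.5, 0.0, 0.5, 1.0)]
initial_loss = reconstruction_loss([0.5, 0.5], [0.5, 0.5], data)
w, v = train(data)
final_loss = reconstruction_loss(w, v, data)

# Inference with the encoder only: the decoder is no longer needed.
codes = [encode(w, x) for x in data]
print(initial_loss, final_loss)
print(codes)
```

The 1-D `codes` are what would then be passed on to clustering, classification or visualization; the decoder only mattered during training.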

Best regards
Christian

raj.prajesh · May 18, 2025, 1:51am

The course content has still not been updated on this. It is very frustrating for a beginner in this topic because it is such a distraction. The forum discussion also seems to point to different conclusions: many suggest the video is wrong, and one person suggests it is just a naming convention. Can someone ELI5?
