Is this true: "Encoder-only models are also known as autoencoding models"?

The lecturer said: “Encoder-only models are also known as autoencoding models.”

But Wikipedia says that “An autoencoder learns two functions: an encoding function that transforms the input data, and a decoding function …”

So, does an autoencoder have a decoding function? The video and Wikipedia seem to be saying opposite things. I’m very confused.

The video can be found in Week 1, “Pre-training large language models”, at 3:43.

Valid point. Encoder-only models are exactly that: encoders. They don’t contain the ‘decoder’ part of the transformer.

Autoencoders do have an encoder but, as you point out, they also have a decoding function.


Thanks, Juan. Hopefully, that part of the video will be updated in the next iteration of the course.

I’ve sent a signal to the monitors. Let’s wait and see how this evolves. Thanks!


Hello @myusername,

I think we shouldn’t mix up the two sets of terminology here.

Wikipedia’s autoencoder indeed has both an encoder and a decoder, and a perfect decoder is expected to reproduce the input.

Source: that Wikipedia page.

A transformer encoder, on its own, is expected to reproduce (the masked tokens of) the input sentence.
Source: the BERT paper.

They just share the English word “encoder”, but it has a different meaning in each context. You might say the transformer encoder is called the “transformer encoder” because there exists the option of a “transformer decoder” downstream, and we need to cope with the naming conventions.
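For intuition, here is a minimal sketch of the masked-token objective from the BERT paper, in plain Python. Only the data-preparation step is shown; the tokenizer, the encoder model, and the 80/10/10 replacement details from the paper are all left out, and the toy sentence, `MASK` symbol, and helper names are purely illustrative:

```python
import random

random.seed(7)
tokens = "the cat sat on the mat".split()

# BERT-style masking: hide roughly 15% of the tokens (at least one) and
# train the encoder to predict the originals from the surrounding context.
MASK = "[MASK]"
n_mask = max(1, round(0.15 * len(tokens)))
positions = set(random.sample(range(len(tokens)), n_mask))

masked = [MASK if i in positions else tok for i, tok in enumerate(tokens)]
targets = {i: tokens[i] for i in positions}  # what the encoder must recover

print(masked)
print(targets)
```

In this sense the encoder alone “reproduces” its input: the training signal is to fill the `[MASK]` positions back in, with no separate decoder stack involved.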


  • Encoder-only models and autoencoders are different concepts.
  • Encoder-only models do not have a decoder; they map the input to learned representations.
  • Autoencoders consist of both an encoder and a decoder.
  • Encoder-only models are used when only the encoding part is needed, while autoencoders are used for unsupervised learning tasks to compress and reconstruct data.

This is how autoencoders are trained: with both an encoder and a decoder.

However, depending on the use case, autoencoders can also be used with only the encoder at inference time, e.g. for dimensionality reduction in downstream tasks. Here the trained encoder takes care of feature extraction into a low-dimensional space, i.e. efficient compression of high-dimensional data. After dimensionality reduction, these data are usually processed further, e.g. with clustering, classification, or visualization, using the efficient latent-space representation.
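To make the two phases concrete, here is a minimal sketch using a toy linear autoencoder in NumPy. All sizes, learning rate, and data are illustrative assumptions, not any particular model: first train encoder plus decoder to reconstruct the input, then drop the decoder and keep only the encoder for dimensionality reduction.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 200 samples in 10-D that actually lie on a 2-D subspace.
basis = rng.normal(size=(2, 10))
X = rng.normal(size=(200, 2)) @ basis

# A minimal linear autoencoder: encoder W_e (10 -> 2), decoder W_d (2 -> 10),
# trained to reconstruct X by plain gradient descent on squared error.
W_e = rng.normal(scale=0.1, size=(10, 2))
W_d = rng.normal(scale=0.1, size=(2, 10))
lr = 0.01
for _ in range(2000):
    Z = X @ W_e              # encode: compress to the 2-D latent space
    X_hat = Z @ W_d          # decode: reconstruct the 10-D input
    err = X_hat - X
    # Gradients of the mean squared reconstruction error
    grad_Wd = Z.T @ err / len(X)
    grad_We = X.T @ (err @ W_d.T) / len(X)
    W_d -= lr * grad_Wd
    W_e -= lr * grad_We

# Inference: the decoder is dropped, and the trained encoder alone performs
# dimensionality reduction (10-D -> 2-D) for clustering, classification, etc.
Z = X @ W_e
print(Z.shape)  # (200, 2)
```

The design point is exactly the one above: the decoder is only needed to create the training signal (reconstruction error); once trained, the encoder is a standalone compression function.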

Best regards