Variational Autoencoder

Can I use a Vision Transformer (ViT) instead of a CNN as the encoder in a VAE, since a ViT extracts more information than a CNN?

Yes, you can. You only need to make sure the images are converted to tensors before they go into the encoder.

Here is the link related to that; you have probably seen it already.

As far as I understand, a ViT-based VAE is still a VAE model; it just uses the transformer encoder in place of the CNN encoder.
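
To make the idea concrete, here is a minimal PyTorch sketch of a VAE whose encoder is a small ViT-style transformer. It is only an illustration under assumptions, not a reference implementation: the class names (ViTEncoder, ViTVAE), the sizes (32x32 images, 8x8 patches, 16 latent dimensions), and the plain MLP decoder are all made up for this example, and the images are assumed to already be float tensors scaled to [0, 1].

```python
import torch
import torch.nn as nn


class ViTEncoder(nn.Module):
    """Hypothetical ViT-style encoder returning the mean and log-variance of q(z|x)."""

    def __init__(self, image_size=32, patch_size=8, in_channels=3,
                 dim=64, depth=2, heads=4, latent_dim=16):
        super().__init__()
        num_patches = (image_size // patch_size) ** 2
        # A strided convolution cuts the image into patches and projects
        # each patch to a `dim`-dimensional token.
        self.patch_embed = nn.Conv2d(in_channels, dim,
                                     kernel_size=patch_size, stride=patch_size)
        self.cls_token = nn.Parameter(torch.zeros(1, 1, dim))
        self.pos_embed = nn.Parameter(torch.zeros(1, num_patches + 1, dim))
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads,
                                           dim_feedforward=dim * 2,
                                           batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=depth)
        # Two linear heads replace the usual CNN encoder outputs.
        self.to_mu = nn.Linear(dim, latent_dim)
        self.to_logvar = nn.Linear(dim, latent_dim)

    def forward(self, x):
        tokens = self.patch_embed(x).flatten(2).transpose(1, 2)  # (B, N, dim)
        cls = self.cls_token.expand(x.shape[0], -1, -1)
        tokens = torch.cat([cls, tokens], dim=1) + self.pos_embed
        summary = self.transformer(tokens)[:, 0]  # output at the class token
        return self.to_mu(summary), self.to_logvar(summary)


class ViTVAE(nn.Module):
    """Hypothetical VAE: transformer encoder plus a simple MLP decoder."""

    def __init__(self, image_size=32, in_channels=3, latent_dim=16):
        super().__init__()
        self.encoder = ViTEncoder(image_size=image_size,
                                  in_channels=in_channels,
                                  latent_dim=latent_dim)
        self.out_shape = (in_channels, image_size, image_size)
        # The decoder is kept deliberately simple; a convolutional or
        # transformer decoder could be swapped in.
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(),
            nn.Linear(256, in_channels * image_size * image_size),
            nn.Sigmoid(),
        )

    def forward(self, x):
        mu, logvar = self.encoder(x)
        # Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, I).
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        recon = self.decoder(z).view(-1, *self.out_shape)
        return recon, mu, logvar


def vae_loss(recon, x, mu, logvar):
    """Reconstruction term plus KL divergence to N(0, I), averaged over the batch."""
    recon_loss = nn.functional.binary_cross_entropy(recon, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return (recon_loss + kl) / x.shape[0]


# Usage on a random batch standing in for real images scaled to [0, 1].
images = torch.rand(4, 3, 32, 32)
model = ViTVAE()
recon, mu, logvar = model(images)
loss = vae_loss(recon, images, mu, logvar)
```

The only change compared with a CNN-based VAE is the encoder: the transformer output at the class token is mapped to the mean and log-variance, and the rest (sampling, decoder, loss) stays the same. Whether this actually beats a CNN encoder likely depends on the dataset size and image resolution, so it is worth comparing both.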