Hi everyone, I'm working on a project where I'm trying to train an autoencoder on STL10 data. Right now I'm just experimenting with architectures to get good reconstruction quality, but no matter which architecture I try, I can't get good results. The main issue is poor color reconstruction. I have tried resnet18 as the encoder with several decoders (10-15 layers, with / without skip connections).
Some details of the training procedure –
- using the 'train+unlabeled' split of STL10
- encoders used – resnet18 (PyTorch model, without pretrained weights) and smallAlexnet (link)
- decoders used – my own architectures (for both the resnet and alexnet encoders)
- hyperparameter values taken from the alexnet repo (link above) – batch size (b) = 512 or 768, LR = 0.12 * (b/256), LR decayed by a factor of 0.1 at epochs 100, 150 and 180, total epochs = 200, latent dimension = 128, SGD optimizer
- as of now, using only PyTorch's nn.MSELoss() as the reconstruction loss
- center cropping images to 64x64
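To make the setup above concrete, here is a minimal sketch of the optimizer / schedule / loss wiring as described (the encoder/decoder are throwaway placeholders standing in for my real architectures, and the momentum value is an assumption, since the repo value isn't listed above):

```python
import torch
import torch.nn as nn

# Hyperparameters as listed above
batch_size = 512
base_lr = 0.12 * (batch_size / 256)   # linear LR scaling rule
latent_dim = 128
epochs = 200

# Placeholder encoder/decoder -- the real resnet18/custom decoder go here
encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, latent_dim))
decoder = nn.Sequential(nn.Linear(latent_dim, 3 * 64 * 64),
                        nn.Unflatten(1, (3, 64, 64)))
params = list(encoder.parameters()) + list(decoder.parameters())

# momentum=0.9 is an assumption, not stated in the post
optimizer = torch.optim.SGD(params, lr=base_lr, momentum=0.9)
# decay LR by 0.1 at epochs 100, 150, 180
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[100, 150, 180], gamma=0.1)
criterion = nn.MSELoss()

# one dummy training step on a random batch
# (real data: 64x64 center crops of STL10)
x = torch.rand(8, 3, 64, 64)
loss = criterion(decoder(encoder(x)), x)
optimizer.zero_grad()
loss.backward()
optimizer.step()
scheduler.step()  # called once per epoch in the real loop
```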
I have observed that the loss stagnates at ~0.007 around epochs 130-160. The reconstructed images have very poor color reconstruction; outline reconstruction is okay, though not very good either. FID score is 120-140.
Any suggestions on what I may be doing wrong? I have experimented with a lot of architectures, but the results aren't improving (the rest of the training procedure stays the same; only the architectures change).
Let me know if I should share the exact architecture of the decoder.
Thanks!