I’ve tried to build my own convolutional autoencoder; the input shape is (491, 493, 3).
I’m still having problems selecting the number of layers, pooling, kernel_size…
With no scientific basis, I built my encoder as 5 conv–maxpooling layers and the decoder as 5 convTranspose–upsampling layers.
On calling model.fit I got this error:
I don’t have access to your dataset, so if we continue the discussion around this topic, let’s use a common dataset like cifar10.
Here are a couple of pointers:
- Model input dimension should match image dimension.
- When decoding, why are Conv2DTranspose and UpSampling2D used in adjacent layers? They both go in the opposite direction of convolution, so use one of them. Consider using Conv2D in the decoder path as well.
- Do you need your NN to be that deep?
- When using Conv2D, usual practice is to keep the number of filters as a power of 2.
- The loss function for the model is wrong. Since we want to recreate the same image, use a distance measure between the pixels of the original image and the reconstructed image (e.g. MSE).
- The label should be the input image itself.
- Tune Adam’s hyperparameters only if necessary; the defaults are a good starting point.
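Putting the pointers above together, here is a minimal sketch of a convolutional autoencoder for cifar10-shaped inputs (32, 32, 3). The depth, filter counts, and kernel sizes are illustrative assumptions, not a tuned model: a shallow encoder of Conv2D + MaxPooling2D stages with power-of-2 filters, a decoder using Conv2D + UpSampling2D only (no Conv2DTranspose mixed in), MSE as the pixel-distance loss, default Adam, and the image itself as the label.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

def build_autoencoder(input_shape=(32, 32, 3)):
    # Model input dimension matches the image dimension.
    inp = layers.Input(shape=input_shape)

    # Encoder: Conv2D + MaxPooling2D, filters as powers of 2.
    x = layers.Conv2D(32, 3, activation="relu", padding="same")(inp)
    x = layers.MaxPooling2D(2)(x)
    x = layers.Conv2D(64, 3, activation="relu", padding="same")(x)
    x = layers.MaxPooling2D(2)(x)

    # Decoder: Conv2D + UpSampling2D only, so a single mechanism
    # handles the upsampling direction.
    x = layers.Conv2D(64, 3, activation="relu", padding="same")(x)
    x = layers.UpSampling2D(2)(x)
    x = layers.Conv2D(32, 3, activation="relu", padding="same")(x)
    x = layers.UpSampling2D(2)(x)
    out = layers.Conv2D(3, 3, activation="sigmoid", padding="same")(x)

    model = models.Model(inp, out)
    # Pixel-distance loss between original and reconstruction;
    # Adam with default settings.
    model.compile(optimizer="adam", loss="mse")
    return model

model = build_autoencoder()
# The label is the image itself: fit(x, x).
x = np.random.rand(8, 32, 32, 3).astype("float32")
model.fit(x, x, epochs=1, batch_size=4, verbose=0)
print(model.output_shape)  # (None, 32, 32, 3)
```

Note that 32 is divisible by 2 twice, so two pooling stages reconstruct cleanly to the original size; with an odd input like (491, 493, 3), repeated 2x pooling and upsampling will not round-trip exactly, which is one common source of shape-mismatch errors in model.fit.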