I think that, in contrast to the dense neural network, the convolutional autoencoder is not really limiting the number of parameters, but actually increasing them.
We have an input of 28x28x1, and in the next layer the dimensions are already 28x28x64 (the graphic is a bit misleading).
So it is not a bottleneck? It is more like a feature extraction? Or is this only the case in this example?
Maybe the better performance is due to some form of overfitting and memorizing the images?
That 28x28x64 doesn't mean all of those values are trainable parameters; that is the size of the output activations. Only the kernels of the convolutions have trainable parameters.
Yes, you are right. But when I look at the model summary, the bottleneck has by far the most trainable parameters of all the layers. The logic seems to be different from the fully connected network.
The principle is different. One could say that in the downsampling path you extract features continuously (with more and more filters), so the number of parameters increases.
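To make that concrete, here is a quick sketch of where the trainable parameters actually live. The 3x3 kernel size is my assumption, not necessarily what the course model uses:

```python
from tensorflow.keras import layers, models

# A single conv layer on a 28x28x1 input, like the first encoder stage
m = models.Sequential([
    layers.Input(shape=(28, 28, 1)),
    layers.Conv2D(64, (3, 3), padding="same", activation="relu"),
])
m.summary()
# Trainable params: 3*3*1*64 kernel weights + 64 biases = 640.
# The 28x28x64 output is 50,176 activation values, but none of
# those are trainable parameters; only the kernels are.
```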
Just to jump in here, since I have the same question…
As was mentioned, in the dense autoencoder the bottleneck was a dense layer with 32 neurons. That means if I feed a 28x28x1 (784-pixel) image into the encoder, it produces a 1x32 vector as the latent representation, which is then fed into the decoder network to reproduce the image:
(28x28) → encoder → (1x32) → decoder → (28x28)
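In code, I picture the dense version roughly like this (the 32-unit bottleneck is from the lesson; the hidden layer sizes are just my guess):

```python
from tensorflow.keras import layers, models

dense_ae = models.Sequential([
    layers.Input(shape=(28, 28, 1)),
    layers.Flatten(),                         # 784 values
    layers.Dense(128, activation="relu"),     # assumed hidden layer
    layers.Dense(32, activation="relu"),      # the 32-d latent vector
    layers.Dense(128, activation="relu"),
    layers.Dense(784, activation="sigmoid"),
    layers.Reshape((28, 28, 1)),              # back to 28x28
])
dense_ae.summary()  # the bottleneck output shape is (None, 32)
```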
Now, with the CNN version, the bottleneck layer of the encoder network is a Conv2D layer with 256 filters that takes in the 7x7 output of the previous MaxPool2D layer. What is the dimensionality of the latent representation in that case? Wouldn’t it be 7x7x256, or 12,544?
(28x28) → encoder → (7x7x256) → decoder → (28x28)
For visualization purposes that 7x7x256 is then collapsed using an additional Conv2D(1) layer to produce a 7x7 image. But in contrast to the 32 dimensional vector in the dense network, this 7x7 = 49 image is not the actual latent representation, is it?
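Here is how I read the encoder shapes; the filter counts before the final 256 are my assumptions, not necessarily the course's exact model:

```python
from tensorflow.keras import layers, models

encoder = models.Sequential([
    layers.Input(shape=(28, 28, 1)),
    layers.Conv2D(64, (3, 3), padding="same", activation="relu"),
    layers.MaxPooling2D((2, 2)),    # 14x14x64
    layers.Conv2D(128, (3, 3), padding="same", activation="relu"),
    layers.MaxPooling2D((2, 2)),    # 7x7x128
    layers.Conv2D(256, (3, 3), padding="same", activation="relu"),
])
encoder.summary()
# Final output shape: (None, 7, 7, 256), i.e. 12,544 values.
# A Conv2D(1, ...) on top would collapse this to 7x7 purely for
# visualization; the actual latent tensor is the full 7x7x256.
```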
In terms of bits of information, the input has 28x28x16 = 12,544 bits, while the latent tensor (assuming 16-bit floats) has on the order of 7x7x256x16 = 200,704 bits. That is more than the input by a factor of 16, not less; not exactly an information bottleneck.
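A quick back-of-the-envelope check of those numbers (assuming 16-bit values for both the input pixels and the latent floats):

```python
input_bits  = 28 * 28 * 1 * 16     # 12,544 bits
latent_bits = 7 * 7 * 256 * 16     # 200,704 bits
print(latent_bits / input_bits)    # 16.0, so the latent is 16x larger
```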