While it feels intuitively right to start with low-resolution images and gradually increase the resolution, much like how we learn any topic by gradually increasing its complexity, I am not able to clearly put into words how it actually helps or how it even occurred to the authors.
Is it just pure experimentation, or is there a mathematical basis to it?
Hi, @Richeek_Arya! Thank you for your question.
What you’re referring to is called progressive growing. As mentioned in the video lecture, this technique of gradually increasing the image resolution during training was first proposed in ProgressiveGAN.
As for the intuition behind gradually increasing the input image resolution during training, I think there are a couple of reasons to do so (see also the short sketch after these two points).
More stable training. In one of the introductory lectures it was mentioned that it's critical for the generator and discriminator to learn in unison: the generator has the harder task of producing realistic output, whereas the discriminator only has to predict whether an image is real or fake. Starting training with lower-resolution images helps balance the learning of the generator and the discriminator.
Image as a hierarchy of features. As an analogy, think of deep convolutional networks, where the convolutional kernels in early layers learn coarse features (shapes, silhouettes, etc.), whereas kernels closer to the last layer learn fine features such as the high-frequency details of an image. Similarly, progressive growing aims to realistically produce low-level as well as high-level features.
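To make the first point a bit more concrete, here is a minimal sketch of a progressive-growing style resolution schedule. This is not the actual ProgressiveGAN code: the schedule values are arbitrary, and the real method also fades new layers into the generator and discriminator at each stage, which is only hinted at in the comments.

```python
import torch
import torch.nn.functional as F

# Illustrative resolution schedule: start coarse and grow toward the target size.
# The exact resolutions and the number of steps per stage are design choices,
# not the official ProgressiveGAN settings.
schedule = [4, 8, 16, 32, 64]

# Stand-in for a batch of full-resolution real images (batch, channels, H, W).
real_images = torch.randn(8, 3, 64, 64)

for resolution in schedule:
    # Downsample the real data to the resolution of the current training stage,
    # so the discriminator only has to judge coarse structure early on.
    real_lowres = F.interpolate(real_images, size=(resolution, resolution),
                                mode="bilinear", align_corners=False)
    # In a real implementation, new layers would be faded into both the
    # generator and the discriminator here, and training would run for many
    # steps at this resolution before moving to the next stage.
    print(resolution, real_lowres.shape)
```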
Hope this helps in your exploration of the matter. Please tag me if you have additional ideas or suggestions. Thank you.
Thanks for your response! I wanted to ask a couple of follow-up questions.
For point 1: That is based on intuition, right? Just checking if there is a mathematical basis to it as well.
For point 2: I have read that as well but never understood it. Could you please comment on how you arrived at this? Is it that we plot something at the end of every layer and deduce it?
There's definitely mathematical reasoning behind this; however, it's not something I can put into a single formula. Instead, please recall that in GANs we calculate two losses, one for the generator and one for the discriminator. Imagine a situation where the discriminator loss instantly drops (or explodes) and doesn't change throughout the training process. In such a case there's a high chance the generator won't be able to learn from the discriminator's feedback, and as a result GAN training becomes unstable. So there is a computational reason behind progressively growing the input images.
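To put a shape on those two losses, here is a minimal sketch using the standard binary cross-entropy GAN objective. The tiny `G` and `D` networks and batch sizes are placeholders, not a real architecture; the point is just what gets monitored during training.

```python
import torch
import torch.nn as nn

# Placeholder generator and discriminator, just to make the two losses concrete.
G = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 784), nn.Tanh())
D = nn.Sequential(nn.Linear(784, 64), nn.LeakyReLU(0.2), nn.Linear(64, 1))

bce = nn.BCEWithLogitsLoss()
real = torch.randn(32, 784)   # stand-in for a batch of real (flattened) images
z = torch.randn(32, 16)       # latent noise
fake = G(z)

# Discriminator loss: real images should score 1, generated images should score 0.
d_loss = bce(D(real), torch.ones(32, 1)) + bce(D(fake.detach()), torch.zeros(32, 1))

# Generator loss: the generator wants its samples to be classified as real.
g_loss = bce(D(fake), torch.ones(32, 1))

# If d_loss collapses to ~0 (or explodes) and stays there, the discriminator's
# gradients give the generator almost no useful signal, which is exactly the
# instability described above.
print(d_loss.item(), g_loss.item())
```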
The input to the first layer is a normalised array of input pixels to which a set of convolution kernels is applied. The output of the first layer is a set of N (the number of output channels) feature maps that are fed to the next layer. As a result, the convolution kernels at each layer "learn" the features present in the feature maps they receive. As training advances, the convolution kernels across the layers produce a hierarchy of learned features.
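Regarding your "plot something at the end of every layer" question: yes, that is roughly what people do. A common trick is to register forward hooks and inspect the intermediate feature maps. Below is a rough sketch with a toy, untrained conv stack just to show the mechanics; in practice you would do this on a trained network.

```python
import torch
import torch.nn as nn

# Toy convolutional stack; in practice you would load a trained model.
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),   # early layer: coarse features
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),  # deeper layer: finer features
)

feature_maps = {}

def save_output(name):
    # Forward hook that stores the feature maps produced by a given layer.
    def hook(module, inputs, output):
        feature_maps[name] = output.detach()
    return hook

model[0].register_forward_hook(save_output("conv1"))
model[2].register_forward_hook(save_output("conv2"))

image = torch.randn(1, 3, 64, 64)   # stand-in for a normalised input image
model(image)

# Each entry is a set of feature maps (one per output channel) that you can
# plot layer by layer to see what kind of structure each layer responds to.
for name, fmap in feature_maps.items():
    print(name, fmap.shape)
```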
Hope I didn't confuse you even more. To summarise, and to answer your original question: there is a computational reason for progressively growing the images.
For a deeper dive on point 2) that Dmitriy is making here, one place to look would be the lecture “What Are Deep ConvNets Learning” from Prof Ng in DLS C4 W4. It’s also available on YouTube. There he describes some really interesting work that examines what is happening in the hidden layers of a trained ConvNet.
Hi, @Richeek_Arya! It's nice to see you are interested in a deeper understanding. While my colleagues have already made concise, good points, I would like to refer you to one of the most important models that highlighted the importance of upsampling: the U-Net.
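In case it helps, here is a rough sketch of the kind of upsample-plus-skip-connection block that the U-Net decoder is built around. This is a generic illustration with arbitrary channel counts, not the exact architecture from the U-Net paper.

```python
import torch
import torch.nn as nn

class UpBlock(nn.Module):
    """Generic U-Net-style decoder block: upsample, concatenate the skip
    connection from the encoder, then refine with convolutions."""
    def __init__(self, in_ch, skip_ch, out_ch):
        super().__init__()
        self.up = nn.ConvTranspose2d(in_ch, out_ch, kernel_size=2, stride=2)
        self.conv = nn.Sequential(
            nn.Conv2d(out_ch + skip_ch, out_ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(),
        )

    def forward(self, x, skip):
        x = self.up(x)                    # double the spatial resolution
        x = torch.cat([x, skip], dim=1)   # reuse fine detail from the encoder
        return self.conv(x)

# Quick shape check with dummy tensors.
block = UpBlock(in_ch=128, skip_ch=64, out_ch=64)
decoder_feat = torch.randn(1, 128, 16, 16)
encoder_skip = torch.randn(1, 64, 32, 32)
print(block(decoder_feat, encoder_skip).shape)   # -> (1, 64, 32, 32)
```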