What is the network architecture for Neural Style Transfer?

A critical piece seems to be missing from the explanation of Neural Style Transfer: what is the network architecture? My guess is that it is a Siamese network with three input images: the generated, the content, and the style image. Note that this would be a very unusual architecture, because the parameters being trained are in the input layer (i.e., the generated image). Am I correct?
Also, can you train the parameters of the deeper layers simultaneously with training the generated image?

Hi, interesting question. You don’t actually need a ‘model’ as such, and you’re not training over a dataset of content, style, and generated images. Style transfer is an algorithm that works with just two images, a Content image and a Style image. That is, you could give me any two images, one content and one style, and I could run style transfer for you, provided I have some pretrained image classifier at hand just for the encodings. The image classifier could be anything; all I care about is that it has learned to keep meaningful features extracted from the original image that carry information about its content and style. I then randomly initialize a generated image, calculate the content loss and the style loss, and work on reducing those losses by updating the generated image itself. Et voilà, over many iterations of reducing the loss, the generated image becomes the desired output.
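To make that concrete, here is a minimal sketch of the optimization loop in Python with TensorFlow/Keras. It assumes a pretrained VGG19 as the classifier providing the encodings; the particular layer names, loss weights, learning rate, and step count are just illustrative assumptions, not the course's exact settings.

```python
import tensorflow as tf

# Pretrained classifier used only for its encodings; its weights are never updated.
vgg = tf.keras.applications.VGG19(include_top=False, weights="imagenet")
vgg.trainable = False

# Illustrative layer choices (not necessarily the course's exact settings).
style_layers = ["block1_conv1", "block2_conv1", "block3_conv1", "block4_conv1"]
content_layers = ["block5_conv2"]
outputs = [vgg.get_layer(name).output for name in style_layers + content_layers]
feature_extractor = tf.keras.Model(vgg.input, outputs)

def gram_matrix(features):
    # Gram matrix: the style representation used in the Gatys et al. paper.
    gram = tf.linalg.einsum("bijc,bijd->bcd", features, features)
    h, w = features.shape[1], features.shape[2]
    return gram / tf.cast(h * w, tf.float32)

def extract(image):
    # `image` is assumed to be a float tensor of shape (1, H, W, 3) in [0, 1].
    feats = feature_extractor(
        tf.keras.applications.vgg19.preprocess_input(image * 255.0)
    )
    style_feats = [gram_matrix(f) for f in feats[: len(style_layers)]]
    content_feats = feats[len(style_layers):]
    return style_feats, content_feats

def style_transfer(content_image, style_image, steps=1000,
                   style_weight=1e-2, content_weight=1e4):
    style_targets, _ = extract(style_image)
    _, content_targets = extract(content_image)

    # The only "parameters" being optimized are the pixels of the generated image.
    generated = tf.Variable(content_image)  # could also start from random noise
    optimizer = tf.keras.optimizers.Adam(learning_rate=0.02)

    for _ in range(steps):
        with tf.GradientTape() as tape:
            style_feats, content_feats = extract(generated)
            style_loss = tf.add_n(
                [tf.reduce_mean((s - t) ** 2) for s, t in zip(style_feats, style_targets)]
            )
            content_loss = tf.add_n(
                [tf.reduce_mean((c - t) ** 2) for c, t in zip(content_feats, content_targets)]
            )
            loss = style_weight * style_loss + content_weight * content_loss
        grad = tape.gradient(loss, generated)
        optimizer.apply_gradients([(grad, generated)])
        generated.assign(tf.clip_by_value(generated, 0.0, 1.0))

    return generated
```

Note how this relates to the architecture question above: there is no Siamese network being trained. The classifier's weights stay frozen, and gradient descent updates only the pixels of the generated image.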

There is no input/output kind of model. You are basically minimizing a loss computed from just the two images’ encodings. I suggest you think of neural style transfer as an algorithm that is an application of CNNs, rather than a model/architecture in itself.

Thank you for the explanation. I think that this is a critical point and it should be stated explicitly in the videos, as it is not self-evident.

I would think the results of neural style transfer might depend somewhat on the images in the training set for the pre-trained image classifier. For example, it might matter whether the classifier was trained on images of faces, text, or sign language. Does anyone have experience confirming this?

The results of training always depend on the contents of the training set.

Performance of NST does indirectly depend on the training images, but I think it is important to remember how that information is retained and used during NST, since the training-set inputs are long gone by the time you are doing transfer learning. Clearly, it is through the learned weights. And they, in turn, depend not only on the original training set but also on the original model architecture and the loss function that was used during training.

I think the statements above that “there is no input/output kind of model” and “the image classifier could be anything” undervalue that importance. Without an underlying model with useful feature-extraction layers, and without weights learned from a suitable loss function, NST outputs won’t be pretty, either aesthetically or from a computer-science standpoint.

NST also depends heavily on how style similarity is measured on the two input images (Style and Content). The paper referenced in this exercise, and its code, use the Gram matrix, but that isn’t the only choice available (see the paper Neural Style Transfer: A Review linked below).

The ability of NST to produce interesting output depends not only on the original training images, but also on the original classifier network architecture, the original loss function, the learned weights retained in the trained model, the choice of style and content layers in the classifier network, and the choice made for measuring Style similarity.
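For concreteness, here is a hedged sketch of two ways that style-similarity measurement might be implemented: the Gram matrix used by Gatys et al., and channel-wise statistics (mean and standard deviation) of the kind used by some of the alternative methods surveyed in the review paper. The function names are illustrative, not from any particular codebase.

```python
import tensorflow as tf

def gram_style(features):
    # Gram-matrix style representation (Gatys et al.): correlations between channels.
    gram = tf.linalg.einsum("bijc,bijd->bcd", features, features)
    h, w = features.shape[1], features.shape[2]
    return gram / tf.cast(h * w, tf.float32)

def moment_style(features):
    # An alternative style statistic: channel-wise mean and standard deviation,
    # as used by some of the methods surveyed in the review paper.
    mean, variance = tf.nn.moments(features, axes=[1, 2])
    return mean, tf.sqrt(variance + 1e-6)
```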

Here are some related links for further contemplation:

The original VGG paper: https://arxiv.org/pdf/1409.1556.pdf
An implementation of VGG-19 in Python and Keras: deep-learning-models/vgg19.py at master · fchollet/deep-learning-models · GitHub
The Gatys et al paper: https://arxiv.org/pdf/1508.06576.pdf
A review of style transfer, both before and since the Gatys paper: https://arxiv.org/pdf/1705.04058.pdf
