I really like the idea of Neural Style Transfer and would like to create my own. To do this, is there a way I can download the code for VGG19, or can I just run: from tensorflow.keras.applications import VGG19, then set include_top to False and trainable to False? After doing this, I am assuming I can follow the steps used in Course 4 Week 4 of the Deep Learning Specialization to make this my own model? Also, when running this in VS Code, would a laptop be able to run a VGG network with more than 19 layers? This is a lot of questions, and I would appreciate any help at all!
Regarding a network with 19+ layers: I believe this can be borderline. Whether it runs in a reasonable time on your laptop or computer depends on your hardware specs and on the data itself.
Feel free to consider using GPUs. If you do not want to buy new hardware, you can of course also "rent" it in the cloud, e.g. via:
- Google Colab
- or cloud providers like Azure, AWS, Google Cloud
Side note: I did something similar with style transfer (see this repo) on a MacBook Pro (Core i5 2320, 8 GB RAM). Creating the one picture you can see in the repo took more than 10 hours of training, which is quite a lot.
Both pictures, style and content, were taken with an iPhone camera and were not edited before training.
Hope that helps! Good luck with your project, @GageMars!
It’s definitely worth trying to run the Neural Style Transfer on your local computer. What happens in the notebook is pretty complicated, but if you look closely you’ll see that we don’t actually train VGG-19 in order to accomplish the style transfer. The training just works on the actual images themselves and is just using VGG-19 in inference (“prediction”) mode. I haven’t looked at the details of the layer sizes in VGG-19, but normally it’s training that’s the expensive thing because of the huge dataset sizes that are required. Just running the trained model shouldn’t be a problem.
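To make the "inference mode" point concrete, here is a minimal sketch of loading VGG-19 from Keras without its classifier head, freezing it, and exposing a few hidden layers as outputs. The layer names follow Keras' VGG19 naming, but which layers to pick is an assumption here, and build_feature_extractor is a hypothetical helper name, not something from the course notebook.

```python
import tensorflow as tf

# Which hidden layers to read is a design choice; these names are an assumption.
STYLE_LAYERS = ["block1_conv1", "block2_conv1", "block3_conv1",
                "block4_conv1", "block5_conv1"]
CONTENT_LAYER = "block5_conv2"

def build_feature_extractor(weights="imagenet"):
    """Return a frozen model mapping an image batch to hidden-layer activations.

    weights="imagenet" downloads the pretrained weights on first use;
    weights=None gives random weights and is only useful for a dry run.
    """
    # include_top=False drops the dense classifier layers at the end.
    vgg = tf.keras.applications.VGG19(include_top=False, weights=weights)
    # Freeze every layer: the network is only used to compute activations.
    vgg.trainable = False
    outputs = [vgg.get_layer(name).output
               for name in STYLE_LAYERS + [CONTENT_LAYER]]
    return tf.keras.Model(inputs=vgg.input, outputs=outputs)
```

Calling build_feature_extractor() once and feeding it a batch of shape (1, height, width, 3) returns one activation tensor per listed layer. With the pretrained weights, inputs would normally go through tf.keras.applications.vgg19.preprocess_input first.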
Side note: I used gradient descent with Adam for the optimization, with an overall loss that combines the weighted style loss and the content loss as suggested in the original 2015 paper (see formula 7), starting from a white-noise picture on my machine (see hardware specs in the post above).
Also here you can find a nice step-by-step explanation by TF.
I would be quite interested to hear how training performs on your local machine and how many pictures you want to "style", @GageMars.
Feel free to keep us posted!
Correct. Like in the notebook in the assignment, we imported the VGG model from Keras, then "froze" the layers to use their pretrained weights. We then set include_top to False, and we created our own loss and last layers. Is this the process I would use to create my own without training an entire model?
What you are describing would be the process to do Transfer Learning with VGG-19, but that is not really what is happening here. Take another look through what happens in the notebook. Notice that we are not adding any new output layers to VGG-19. What we do is use the existing trained VGG-19 model in "inference" mode and then feed it three different images: the "style" image, the "content" image and our new "generated" image. Then we extract the activation outputs produced by a selection of internal hidden layers of VGG-19 on those three images, and we define two different cost functions based on those hidden layer activation values. The overall loss is defined as a weighted sum of the style loss and the content loss.
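As a rough illustration of those two cost functions, here is a small NumPy sketch that works on activations already extracted from a hidden layer. The normalization constants and the default weights alpha and beta are assumptions (one common choice), and the function names are hypothetical.

```python
import numpy as np

def content_loss(a_content, a_generated):
    """Mean squared difference between two (H, W, C) activation maps."""
    return float(np.mean((a_content - a_generated) ** 2))

def gram_matrix(a):
    """Channel-by-channel correlations of an (H, W, C) activation map."""
    h, w, c = a.shape
    flat = a.reshape(h * w, c)      # one row per spatial position
    return flat.T @ flat            # shape (C, C)

def style_loss(a_style, a_generated):
    """Squared Gram-matrix difference for one layer (normalization is an assumption)."""
    h, w, c = a_style.shape
    diff = gram_matrix(a_style) - gram_matrix(a_generated)
    return float(np.sum(diff ** 2) / (4.0 * c ** 2 * (h * w) ** 2))

def total_loss(j_content, j_style, alpha=10.0, beta=40.0):
    """Weighted sum of content and style losses; alpha and beta are assumed defaults."""
    return alpha * j_content + beta * j_style
```

With several style layers, one simple option is to average the per-layer style_loss values before passing the result to total_loss.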
Now look at how the training works: we are not training anything about VGG-19 or any additional layers added to that. We are applying the gradients generated by our custom loss function directly to the generated image.
At least that’s my reading of the code …
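A minimal sketch of that idea, assuming TensorFlow 2: the only tf.Variable is the generated image, and the frozen VGG-19 is just a function the gradients flow through. To keep it short it uses a single hidden layer, a content-style loss only, and plain gradient descent instead of Adam; make_extractor and optimize_image are hypothetical names, and the layer and step size are assumptions.

```python
import tensorflow as tf

def make_extractor(weights="imagenet"):
    # Frozen VGG-19 without the classifier head; weights=None skips the download.
    vgg = tf.keras.applications.VGG19(include_top=False, weights=weights)
    vgg.trainable = False
    # A single hidden layer keeps the sketch short (the layer choice is an assumption).
    return tf.keras.Model(vgg.input, vgg.get_layer("block4_conv1").output)

def optimize_image(extractor, content_image, steps=5, learning_rate=0.02):
    """Update the generated image itself; the network weights never change.

    content_image: float tensor of shape (1, H, W, 3) with values in [0, 1].
    VGG-specific input preprocessing is omitted here for brevity.
    """
    target = extractor(content_image)
    # Start from noise; this image is the only thing being optimized.
    generated = tf.Variable(tf.random.uniform(tf.shape(content_image)))
    losses = []
    for _ in range(steps):
        with tf.GradientTape() as tape:
            activations = extractor(tf.identity(generated))
            loss = tf.reduce_mean(tf.square(activations - target))
        # Gradient of the loss with respect to the pixels, not the weights.
        grad = tape.gradient(loss, generated)
        generated.assign_sub(learning_rate * grad)
        # Keep pixel values in the valid range after each step.
        generated.assign(tf.clip_by_value(generated, 0.0, 1.0))
        losses.append(float(loss))
    return generated, losses
```

A full style transfer would add the style term to the loss and typically run many more steps, but the structure of the loop stays the same.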