Layer selection for content cost


Week 4: Neural style transfer assignment.
In the lesson and in the notebook text, it says that a middle layer should be selected to evaluate the content cost, but in the notebook code the last layer, ‘block5_conv4’, is selected.
Am I missing something?

Which week number and which assignment?


Sorry, just edited the original post.

They do talk about using middle layers here in the notebook as well. Here’s some verbiage from early on:

To choose a “middle” activation layer a[l]:

You need the “generated” image G to have similar content as the input image C. Suppose you have chosen some layer’s activations to represent the content of an image.

  • In practice, you’ll get the most visually pleasing results if you choose a layer in the middle of the network–neither too shallow nor too deep. This ensures that the network detects both higher-level and lower-level features.
  • After you have finished this exercise, feel free to come back and experiment with using different layers to see how the results vary!

The code here is pretty confusing and hard to read, but I think you’re just missing the details of what they are doing. Notice that first they define this:

    STYLE_LAYERS = [
        ('block1_conv1', 0.2),
        ('block2_conv1', 0.2),
        ('block3_conv1', 0.2),
        ('block4_conv1', 0.2),
        ('block5_conv1', 0.2)]

So that gives you five conv layers spread throughout the predefined model layers. Then later they add one more layer which is block5_conv4:

content_layer = [('block5_conv4', 1)]

vgg_model_outputs = get_layer_outputs(vgg, STYLE_LAYERS + content_layer)

So vgg_model_outputs ends up being a predefined model that, when called on an image, returns a list of 6 activation outputs from those 6 selected layers of the network.
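For reference, here is a minimal, self-contained sketch of what `get_layer_outputs` presumably does under the hood. The real notebook supplies this function for you and uses pretrained ImageNet weights; I pass `weights=None` here just so the sketch runs without downloading anything, and the exact body of the real function may differ.

```python
import tensorflow as tf

STYLE_LAYERS = [
    ('block1_conv1', 0.2),
    ('block2_conv1', 0.2),
    ('block3_conv1', 0.2),
    ('block4_conv1', 0.2),
    ('block5_conv1', 0.2)]

content_layer = [('block5_conv4', 1)]

def get_layer_outputs(vgg, layer_names):
    """Build a model that returns the activations of the selected layers."""
    outputs = [vgg.get_layer(name).output for name, _ in layer_names]
    return tf.keras.Model(inputs=vgg.input, outputs=outputs)

# weights=None avoids the ImageNet download for this sketch;
# the assignment uses the pretrained VGG19.
vgg = tf.keras.applications.VGG19(include_top=False, weights=None,
                                  input_shape=(400, 400, 3))
vgg.trainable = False

vgg_model_outputs = get_layer_outputs(vgg, STYLE_LAYERS + content_layer)
activations = vgg_model_outputs(tf.zeros((1, 400, 400, 3)))
print(len(activations))  # one activation tensor per selected layer
```

Calling the resulting model on an image gives the same list of 6 activations (and the same shapes) shown in the printout below.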

To see what is going on, I added some print statements to this template cell from the notebook:

# Assign the content image to be the input of the VGG model.  
# Set a_C to be the hidden layer activation from the layer we have selected
preprocessed_content = tf.Variable(tf.image.convert_image_dtype(content_image, tf.float32))
a_C = vgg_model_outputs(preprocessed_content)
print(f"len a_C {len(a_C)}")
for ii in range(len(a_C)):
    print(f"shape a_C[{ii}] = {a_C[ii].get_shape()}")

Here’s what I get by running that:

len a_C 6
shape a_C[0] = (1, 400, 400, 64)
shape a_C[1] = (1, 200, 200, 128)
shape a_C[2] = (1, 100, 100, 256)
shape a_C[3] = (1, 50, 50, 512)
shape a_C[4] = (1, 25, 25, 512)
shape a_C[5] = (1, 25, 25, 512)
tf.Tensor(0.01646365, shape=(), dtype=float32)

So everything they are doing in train_step with a_C, a_S and a_G is using those same 6 layers including 5 “middle” layers and then the last conv layer before the final pooling layer.

I guess the other subtlety here is that the weights of the first 5 “middle” layers are set to 0.2 apiece, while the final conv layer gets a weight of 1. That means it gets as much influence on the result as the other 5 combined. As they say in the notebook, once we have all this working we can experiment with both which layers we use and the relative weighting. Disclaimer: I have not personally tried any such experimentation.
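To make that weighting concrete, here is a tiny sketch in the spirit of the notebook's weighted-sum loop. The per-layer cost values are made up purely for illustration; the point is just that five weights of 0.2 sum to 1.0, matching the single content-layer weight.

```python
STYLE_LAYERS = [
    ('block1_conv1', 0.2),
    ('block2_conv1', 0.2),
    ('block3_conv1', 0.2),
    ('block4_conv1', 0.2),
    ('block5_conv1', 0.2)]

def weighted_cost(per_layer_costs, layers):
    # Weighted sum of per-layer costs, one weight per selected layer.
    return sum(w * c for (_, w), c in zip(layers, per_layer_costs))

# Made-up, equal per-layer costs, just to compare total influence:
costs = [1.0] * 5
total_style_weight = weighted_cost(costs, STYLE_LAYERS)
content_weight = 1.0
print(total_style_weight, content_weight)  # the two totals match
```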
