Neural style transfer programming exercise, compute_style_cost()

You are obviously putting a lot of time and energy into this assignment. To be fair, though, the code is not that easy to follow and has lots of “layers” to it (pun intended).

Here’s the text of section 5.5.2:

5.5.2 - Compute the Style image Encoding (a_S)

The code below sets a_S to be the tensor giving the hidden layer activation for STYLE_LAYERS using our style image.

So this statement from your post is not correct:

No, they are not. They are subsets determined by STYLE_LAYERS, so each of them will have 6 entries. The reason it is 6 instead of 5 is that the vgg_model_outputs function has been defined to add one extra “content layer” to STYLE_LAYERS:

content_layer = [('block5_conv4', 1)]

vgg_model_outputs = get_layer_outputs(vgg, STYLE_LAYERS + content_layer)
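For reference, STYLE_LAYERS itself is the list of five (layer name, weight) pairs defined earlier in the notebook. As a sketch (the weights shown are the notebook defaults, so yours may differ if you tuned them):

    STYLE_LAYERS = [
        ('block1_conv1', 0.2),
        ('block2_conv1', 0.2),
        ('block3_conv1', 0.2),
        ('block4_conv1', 0.2),
        ('block5_conv1', 0.2)]

So STYLE_LAYERS + content_layer has 6 entries, and that is what vgg_model_outputs returns activations for.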

If you look carefully at the code in compute_style_cost, you’ll see that it discards the last entry:

    # Set a_S to be the hidden layer activations from the layers we have selected.
    # The last element of the array contains the content layer activation, which must not be used here.
    a_S = style_image_output[:-1]

So you end up with 5 style entries and everything matches STYLE_LAYERS.
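If the [:-1] slice itself is the point of confusion, here is a quick standalone demonstration in plain Python (nothing assignment-specific, just list slicing semantics):

    # A list standing in for the 6 outputs: 5 style entries plus the content entry.
    outputs = ['style1', 'style2', 'style3', 'style4', 'style5', 'content']
    # The [:-1] slice keeps everything except the last element.
    print(outputs[:-1])       # ['style1', 'style2', 'style3', 'style4', 'style5']
    print(len(outputs[:-1]))  # 5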

To convince yourself that this is how it works, you can actually add some instrumentation to the cell that creates the global a_S value that is referenced in train_step:

# Assign the input of the model to be the "style" image
preprocessed_style = tf.Variable(tf.image.convert_image_dtype(style_image, tf.float32))
a_S = vgg_model_outputs(preprocessed_style)

# Paul addition:
print(f"len a_S {len(a_S)}")
for ii in range(len(a_S)):
    print(f"shape a_S[{ii}] = {a_S[ii].get_shape()}")

Try running that cell with the added print code and watch what happens. Maybe it’s worth harping on the “meta” point here: if you are not sure what’s happening in the code, it’s time to start adding some instrumentation to see what is going on. You don’t have to wonder: you can actually add code to see it.
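The same trick works for the encoding of the generated image. Here’s a sketch, assuming the notebook’s names (inside train_step, a_G is computed from generated_image with the same vgg_model_outputs function):

    # Analogous instrumentation for a_G:
    a_G = vgg_model_outputs(generated_image)
    print(f"len a_G {len(a_G)}")
    for ii in range(len(a_G)):
        print(f"shape a_G[{ii}] = {a_G[ii].get_shape()}")

You should see 6 entries in both cases, with the spatial dimensions halving from one block to the next (VGG downsamples with max pooling between blocks) and the last two entries coming from block5.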
