Course 4 Week 4 Programming Assignment2

why same layers are used to encode both content and style image, As style uses 5 layers and content uses 1 layer but 6 layers are used to encode both of them

A layer output can be thought of as representation of the input.

In GAN, we want to generate a new image where the content image is painted in the theme of the style image. Instead of comparing 2 images directly, we use layer output to get an estimate of the distance between 2 images.

A single model is sufficient since we care about the representation of each image across the layers of interest.

Thanks for the reply! I have one more question "Does the content and style image get encoded by same number of layers i.e (5 Style_layers +1 content_layer) 6 layers? "

When comparing the style of the generated image and the original style image, the style layers are considered. Similarly, when comparing the content image and the generated image, content layers are used.

1 Like

But in the programming exercise same layers are used to compute both encoding for content and style images

There is no set rule to select content and style layers. The closer a layer is to the model input, the generated image will look more like the content.

That said, you should follow the hints in the assignment to use the appropriate layers.

In Programming exercise, why same layers are used for content and style encoding. That’s my question.

I agree that the way it is written could be considered a bit misleading. To figure out what’s really happening, you have to look more closely at the code. They compute all six layers for both the style and content images, but look at the logic in the compute cost functions. The style one uses all but the last layer and the content one uses only the last layer.

Can you share code or screenshot of the cost functions where specific layers are used?

It’s part of the template code that they gave you. Here is the beginning of the template code for compute_content_cost function:

    a_C = content_output[-1]
    a_G = generated_output[-1]

So it is treating both of those inputs as arrays or lists and is selecting the last element of each one along the first dimension, right? That’s what -1 means as a python index value: the last element in the list or array dimension.

With that in mind look at the beginning of compute_style_cost and you’ll see it select all but the last element. They even tell you what they are doing in the comments:

    # Set a_S to be the hidden layer activation from the layer we have selected.
    # The last element of the array contains the content layer image, which must not be used.
    a_S = style_image_output[:-1]

    # Set a_G to be the output of the choosen hidden layers.
    # The last element of the list contains the content layer image which must not be used.
    a_G = generated_image_output[:-1]

Maybe that first comment could use a little cleanup, but the combination of the two comments and the code should be pretty clear.

1 Like