W4. Why don't we use every layer in Content Cost Function?

There are two kinds of cost functions in neural style transfer: the content cost function and the style cost function.

The style cost function is defined as a “weighted sum of each layer’s style cost”, so it draws on multiple layers when it is calculated.

On the other hand, the content cost function is defined as the squared L2 norm of the difference between the activations of the content image and the generated image, “at a certain layer l”.

Why do we use only one layer when we calculate the content cost function? Can’t it be a weighted sum over layers, just as we did for the style cost function?

Hey @allegro6335,

Well, to address this point, you first need to understand the content cost function and the style cost function, and what each one is for.

The style cost function aims to capture the statistical properties of the style image, such as texture, color, and overall artistic style. To achieve that, you need to consider multiple layers of the neural network. Each layer captures a different level of information, from low-level features like edges and textures in the shallow layers to high-level features like object shapes in the deeper layers. By using a weighted sum of the style costs from multiple layers, you can balance the importance of style information at different levels of abstraction, resulting in a richer and more complex style transfer.
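
To make the "weighted sum over layers" idea concrete, here is a minimal NumPy sketch. It is not the course's TensorFlow code: the function names, the `(n_H, n_W, n_C)` activation layout, and the normalization constant are assumptions for illustration. Each layer's style is summarized by a Gram matrix of channel correlations, and the per-layer costs are combined with hypothetical weights.

```python
import numpy as np

def gram_matrix(a):
    """Channel-by-channel correlations for activations of shape (n_H, n_W, n_C)."""
    n_H, n_W, n_C = a.shape
    flat = a.reshape(n_H * n_W, n_C)  # one row per spatial position
    return flat.T @ flat              # shape (n_C, n_C); spatial layout is summed away

def layer_style_cost(a_S, a_G):
    """Style cost at one layer: squared distance between the two Gram matrices."""
    n_H, n_W, n_C = a_S.shape
    diff = gram_matrix(a_S) - gram_matrix(a_G)
    # Normalization is one common convention (an assumption here).
    return np.sum(diff ** 2) / (4.0 * (n_C ** 2) * ((n_H * n_W) ** 2))

def style_cost(style_acts, generated_acts, weights):
    """Weighted sum of per-layer style costs; all arguments are per-layer lists."""
    return sum(
        w * layer_style_cost(a_S, a_G)
        for w, a_S, a_G in zip(weights, style_acts, generated_acts)
    )
```

Because the Gram matrix sums over spatial positions, it describes which features co-occur rather than where they are, which is why combining several layers enriches the style description without pinning down the layout.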

The content cost function, on the other hand, has one primary goal: ensuring that the generated image preserves the content of the original image. By using the activations from a specific layer, you target a single level of abstraction in the image. Using activations from multiple layers would make content preservation harder to control, because layers at different levels of abstraction would all contribute to the overall cost, and you might lose fine-grained control over the generated image’s content.
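
For comparison, here is a hedged NumPy sketch of the content cost at a single layer, plus a hypothetical multi-layer variant to show what the question proposes. The function names, the activation layout `(n_H, n_W, n_C)`, and the normalization constant are assumptions for illustration, not the course's exact code.

```python
import numpy as np

def content_cost(a_C, a_G):
    """Squared difference between content and generated activations at one layer."""
    n_H, n_W, n_C = a_C.shape
    # Normalization is one common convention (an assumption here).
    return np.sum((a_C - a_G) ** 2) / (4.0 * n_H * n_W * n_C)

def multi_layer_content_cost(content_acts, generated_acts, weights):
    """Hypothetical variant: weighted sum of content costs over several layers."""
    return sum(
        w * content_cost(a_C, a_G)
        for w, a_C, a_G in zip(weights, content_acts, generated_acts)
    )
```

The single-layer cost is just the special case where one weight is 1 and the rest are 0. Nothing stops you from using more layers, but each extra weight is another knob that pulls the generated image toward a different level of abstraction, which is the loss of control described above.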

Hope that makes sense now.