Style weights (style cost function related)

In the last assignment, the style layer section says that the deeper layers capture higher-level concepts, and that the features in the deeper layers are less localized in the image relative to each other.

  • if you want the generated image to softly follow the style image, try choosing larger weights for deeper layers and smaller weights for the first layers.
  • if you want the generated image to strongly follow the style image, try choosing smaller weights for deeper layers and larger weights for the first layers.

Why is this the case? Could someone please explain it? When the weights are higher, does it mean that those layers are more “important”, or that they are more “sensitive” to fine-tuning and fit better to get a better style at the end? Also, what does it mean that the features are less localized in the image relative to each other?

Thank you all.

This is just restating the points made in the earlier lecture in C4 W4 titled “What are Deep ConvNets Learning?” The earlier layers learn to detect simpler features like edges; then, as you go through the layers, the deeper layers learn to assemble the simpler features into higher-level objects: two edges that meet at a certain angle and appear to have fur on them are likely a cat’s ear. (Just making up a hypothetical example there.)

As for how the result changes when you alter the specific internal layers chosen for the style and the relative weights assigned to them, you can also try some experiments yourself and see if you can reproduce the effects that they are describing here.

In the limiting case of a cat classifier, the final layer just gives a “yes” or “no” answer, right? So that is not localized at all: it applies to the entire image. That’s what they mean: the deeper layers are integrating patterns that may cover larger portions of the image.
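To put rough numbers on “covering larger portions of the image”, here is a minimal sketch that computes the receptive field of each layer, meaning how many input pixels along one side can influence a single activation. The layer list is a hypothetical VGG-like stack of 3x3 convolutions and 2x2 poolings that I made up for illustration, not the exact network from the assignment.

```python
def receptive_fields(layers):
    """Return [(name, receptive_field)] for a sequential stack of layers.

    layers: list of (name, kernel_size, stride) tuples, in forward order.
    """
    r = 1  # receptive field of one unit, in input pixels (one side)
    j = 1  # distance in input pixels between neighbouring units
    result = []
    for name, kernel, stride in layers:
        r = r + (kernel - 1) * j  # each layer widens the field by (k - 1) jumps
        j = j * stride            # striding spreads neighbouring units apart
        result.append((name, r))
    return result


# Hypothetical VGG-like stack (illustrative only).
layers = [
    ("conv1_1", 3, 1), ("conv1_2", 3, 1), ("pool1", 2, 2),
    ("conv2_1", 3, 1), ("conv2_2", 3, 1), ("pool2", 2, 2),
    ("conv3_1", 3, 1),
]

for name, r in receptive_fields(layers):
    print(f"{name}: {r}x{r} input pixels")
```

In this toy stack a conv1_1 unit sees a 3x3 patch while a conv3_1 unit sees a 24x24 patch, so a deeper activation summarizes a much wider region of the image.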

Hello,

Thank you very much for your answer, Sir. But even if I were able to reproduce the effects described above, I still would not understand why this is the case.

For example: if features in the deeper layers are less localized in the image relative to each other, why would smaller weights for deeper layers and larger weights for the first layers generate an image that strongly follows the style image?

It means you’d be putting more emphasis on the earlier layers which are more localized and represent “finer grained” aspects of the image and less emphasis on the layers that are more diffuse and cover a wider area.
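Mechanically, “emphasis” just means a coefficient in a weighted sum. Here is a minimal NumPy sketch, assuming the per-layer style cost is the squared difference of Gram matrices with the usual 1/(4 n_C² (n_H n_W)²) normalization, and that the total is the weighted sum over layers. The function names, the random “activations” and the layer shapes are all made up for illustration; the assignment itself does this in TensorFlow with real network activations.

```python
import numpy as np


def layer_style_cost(a_S, a_G):
    """Style cost for one layer; a_S and a_G have shape (n_H, n_W, n_C)."""
    n_H, n_W, n_C = a_S.shape
    # Unroll to (n_C, n_H * n_W) so that each row holds one channel.
    S = a_S.reshape(n_H * n_W, n_C).T
    G = a_G.reshape(n_H * n_W, n_C).T
    gram_S = S @ S.T  # channel-by-channel correlations of the style image
    gram_G = G @ G.T  # same for the generated image
    return np.sum((gram_S - gram_G) ** 2) / (4.0 * n_C**2 * (n_H * n_W) ** 2)


def style_cost(acts_S, acts_G, weights):
    """Weighted sum of the per-layer style costs."""
    return sum(w * layer_style_cost(s, g)
               for s, g, w in zip(acts_S, acts_G, weights))


# Stand-in activations: one "early" layer (large, few channels) and one
# "deep" layer (small, many channels). Random values, illustration only.
rng = np.random.default_rng(0)
shapes = [(32, 32, 8), (8, 8, 32)]
acts_S = [rng.standard_normal(s) for s in shapes]
acts_G = [rng.standard_normal(s) for s in shapes]

per_layer = [layer_style_cost(s, g) for s, g in zip(acts_S, acts_G)]
early_heavy = style_cost(acts_S, acts_G, [0.9, 0.1])
deep_heavy = style_cost(acts_S, acts_G, [0.1, 0.9])
print(per_layer, early_heavy, deep_heavy)
```

Because each weight multiplies its layer's cost, it also multiplies that layer's gradient, so a larger weight makes the optimizer work harder to match that layer's Gram matrix. In that sense “more important” is the closer reading of the two you proposed.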

But all this is just an intuition. Maybe they are just making that up and haven’t really tried it. You can run the experiments yourself and see if you can see any effect.