Course 4 week 4 second assignment - content cost

In the lectures we are told to pick a middle layer for the content cost. Why, then, does the exercise use “block5_conv4”, which is at the end of the VGG network? I’m probably missing something :)

Thanks in advance!


It is emphasized in the exercise too. See the subsection:

4.1.1 - Make Generated Image G Match the Content of Image C

To choose a “middle” activation layer a^{[l]}:

You need the “generated” image G to have similar content as the input image C. Suppose you have chosen some layer’s activations to represent the content of an image.

  • In practice, you’ll get the most visually pleasing results if you choose a layer in the middle of the network–neither too shallow nor too deep. This ensures that the network detects both higher-level and lower-level features.
  • After you have finished this exercise, feel free to come back and experiment with using different layers to see how the results vary!

So, we can try to test the model with different layers for the content cost. I haven’t done it myself yet but I do expect different results.
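
For anyone who wants to run that experiment, here is a minimal sketch of a content cost computed from one layer's activations. It is not the notebook's code: it uses NumPy instead of TensorFlow, the function name `content_cost` is mine, the random arrays stand in for real VGG activations, and the 1 / (4 · n_H · n_W · n_C) normalization is how I recall the assignment defining it, so treat that constant as an assumption.

```python
import numpy as np

def content_cost(a_C, a_G):
    """Squared-error content cost between two activation volumes.

    a_C, a_G: arrays of shape (n_H, n_W, n_C) taken from the SAME layer,
    for the content image C and the generated image G respectively.
    """
    n_H, n_W, n_C = a_C.shape
    # Assumed normalization constant: 1 / (4 * n_H * n_W * n_C).
    return float(np.sum((a_C - a_G) ** 2) / (4 * n_H * n_W * n_C))

# Stand-in activations (hypothetical shapes and values). In the notebook these
# would be the outputs of whichever layer you pick for the content cost.
rng = np.random.default_rng(0)
a_C = rng.normal(size=(4, 4, 3))
a_G = rng.normal(size=(4, 4, 3))
print(content_cost(a_C, a_G))
```

Changing the layer only changes where `a_C` and `a_G` come from (and therefore their shape); the cost itself is computed the same way, which is what makes it easy to compare a middle layer against a late one.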

Hope this helps

Update: most of what I said here is incorrect. It’s been a while since I had looked at that code. It turns out they append the content cost to the style cost array and just handle it differently. Sorry!

Henrikh has given the general explanation, but the point is you can use any layers you want and they have chosen to also include one of the later layers as well. Also notice that they are using a linear combination with a number of layers, each with its own (tunable) weight value. It’s interesting that they give that later layer a \lambda value of 1, so it has as much effect on the cost as the previous 5 put together. Hmmm. So you have quite a bit of flexibility in how you tweak the layers you are sampling here. Try some experiments and see how the output varies! E.g. try modifying the code not to include that block5_conv4 later layer or set its weight to 0 and see what happens.

I think my question has been misunderstood. Let me explain:

I understand why the middle layer should be used but isn’t block5 near the end? That’s why I was asking the question. It seems strange to explain both in the lectures and in the programming assignment (as you pointed out) that one should use a middle layer and yet suggest using one towards the end…

The lambda weighting is used in the style cost, not in the content cost. I was asking about the content cost specifically.
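
To make the distinction concrete, here is a rough sketch of what I mean. The layer names, weights, and cost values are made up for illustration and are not copied from the notebook; `total_style_cost` is a hypothetical helper.

```python
def total_style_cost(layer_costs, layer_weights):
    """Weighted sum of per-layer style costs: sum over l of lambda_l * J_style_l."""
    return sum(weight * layer_costs[name] for name, weight in layer_weights)

# Hypothetical style layers, each with its own lambda (here they sum to 1).
style_layers = [
    ("style_layer_1", 0.2),
    ("style_layer_2", 0.2),
    ("style_layer_3", 0.2),
    ("style_layer_4", 0.2),
    ("style_layer_5", 0.2),
]

# Made-up per-layer style costs, only to show the arithmetic.
style_costs = {
    "style_layer_1": 10.0,
    "style_layer_2": 20.0,
    "style_layer_3": 30.0,
    "style_layer_4": 40.0,
    "style_layer_5": 50.0,
}

j_style = total_style_cost(style_costs, style_layers)

# The content cost comes from a single layer, so no per-layer lambda is involved.
j_content = 3.0  # made-up value for the one content layer
print(j_style, j_content)
```

So the per-layer weights only matter on the style side; for the content cost the only choice is which single layer to read the activations from.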

Thank you

Oh, sorry, you’re right. It took me a while to find the time to get back and read the notebook instructions and the code again carefully. You’re right: it’s the content cost for which they are using a later layer. But I don’t have an answer for why that differs from what they suggested in the instructions in the content cost section. I can only guess that using a middle layer didn’t work out as well as they had hoped. You can try some experiments. Or maybe even go back and look at the original paper and see if they give any different guidance there.