Choosing activation layer representing the content image

In the week 4 videos and the last exercise, it is recommended to take the content representation from a layer somewhere in the middle of the network, so that the generated image has content similar to the input image and to ‘usually get the most visually pleasing results’. However, in the train_step exercise, we compute the content cost by passing a_C and a_G as arguments to compute_content_cost. These are outputs of the get_layer_outputs function, which covers the layers ‘block1_conv1’, ‘block2_conv1’, ‘block3_conv1’, ‘block4_conv1’, ‘block5_conv1’ and ‘block5_conv4’. My question is: in the training step, how do we ensure that the content cost is computed from the activations of a middle layer of the network?

Hi @Asmat_ullah1 ,

In the training step, the content cost uses activations from a middle layer of the network because one of these layers is specifically selected for the computation.

Although the get_layer_outputs function retrieves activations from multiple layers, the content cost is calculated using the activations of one designated middle layer. This ensures that the generated image retains the input image’s content and gives visually good results (the selection is done in the implementation phase).

Hope this helps. Feel free to ask if you need further assistance!

Thank you for your reply.

The compute_content_cost function includes the line ‘a_C = content_output[-1]’. In my opinion, this means we are selecting the last of the layer outputs, i.e. block5_conv4, to compute the content cost, and I think this is not a middle layer of the network.
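
To make the indexing point concrete, here is a minimal sketch in plain Python. It assumes the outputs come back in the same order as the layers listed in the question; `layer_names` and the string placeholders are hypothetical stand-ins for the real tensors, not the assignment's code.

```python
# Layer names in the order given in the question. Assumption: the list of
# outputs returned by get_layer_outputs follows this same order.
layer_names = [
    "block1_conv1",
    "block2_conv1",
    "block3_conv1",
    "block4_conv1",
    "block5_conv1",
    "block5_conv4",
]

# Hypothetical stand-in for the layer outputs: each entry is a label
# naming its layer instead of an actual activation tensor.
content_output = [f"activations of {name}" for name in layer_names]

# Same indexing as the line quoted from compute_content_cost:
# -1 selects the last element of the list.
a_C = content_output[-1]
selected_layer = layer_names[-1]

print(selected_layer)
```

Under that ordering assumption, the index -1 picks the entry for block5_conv4, which is the last layer named in the list rather than one from the middle.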
