Alpha and Beta Hyperparameters in Neural Style Transfer

Hello,

In the lab “Art Generation with Neural Style Transfer,” the hyperparameters alpha and beta are used to weight the importance of the content and style costs, respectively. The initially specified values for these hyperparameters were alpha = 10 and beta = 40. When I tried swapping the values, the resulting images did not differ significantly in terms of representing the style and content images. What could be the reason?

Training hyperparameters for training session 1:
STYLE_LAYERS = [
    ('block1_conv1', 0.2),
    ('block2_conv1', 0.2),
    ('block3_conv1', 0.2),
    ('block4_conv1', 0.2),
    ('block5_conv1', 0.2)]

alpha = 10, beta = 40

epochs = 20000

optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)

Resulting image:

Training hyperparameters for training session 2:

STYLE_LAYERS = [
    ('block1_conv1', 0.2),
    ('block2_conv1', 0.2),
    ('block3_conv1', 0.2),
    ('block4_conv1', 0.2),
    ('block5_conv1', 0.2)]

alpha = 40, beta = 10

epochs = 20000

optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)

Resulting Image:

Thanks in advance!

But was there any change in the produced images from one setting to the other? I think it depends on how sensitive the content and style losses are to the magnitudes of alpha and beta!
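To make the point about magnitudes concrete, here is a minimal sketch, not the lab's code. It assumes the total cost has the form alpha * J_content + beta * J_style. The function names `weighted_total_cost` and `style_share` are hypothetical, and the raw cost values are made up purely for illustration; the real values depend on the images and the layers chosen.

```python
def weighted_total_cost(j_content, j_style, alpha=10.0, beta=40.0):
    """Return alpha * j_content + beta * j_style."""
    return alpha * j_content + beta * j_style


def style_share(j_content, j_style, alpha, beta):
    """Fraction of the weighted total that comes from the style term."""
    total = weighted_total_cost(j_content, j_style, alpha, beta)
    return beta * j_style / total


# Made-up raw cost values (an assumption), chosen only to show what
# happens when the two unweighted costs live on very different scales.
J_CONTENT = 1.0
J_STYLE = 1000.0

for alpha, beta in [(10, 40), (40, 10), (10000, 1), (1, 10000)]:
    share = style_share(J_CONTENT, J_STYLE, alpha, beta)
    print(f"alpha={alpha:>5}, beta={beta:>5}: style share = {share:.4f}")
```

With these made-up numbers the style term makes up over 99% of the total for both (10, 40) and (40, 10), and only the 10000-to-1 setting hands most of the total to the content term. The optimizer follows gradients rather than cost values, so treat this as intuition only.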


That’s a really interesting question and experiment. Thanks for showing us your results. As they comment in the notebook, this is all subjective, meaning there isn’t really a right or wrong answer: it’s just a question of what you think looks pleasing.

I think there are some interesting differences between the two outputs. E.g. look at the mountain in both cases. This is (as mentioned above) largely subjective, but I’d say that the mountain looks a bit more “realistic” and more like the actual mountain from the content image in the second image. That’s the one in which you flipped the weights to put more emphasis on the content rather than the style. In the second image, the edges of the road are a bit better preserved as well. And (now getting further out on the edge of subjectivity), the trees on the left side of the road look a little bit more like trees, whereas in the 10, 40 image they aren’t really distinguishable as trees.

Maybe this is just “motivated reasoning” on my part, but based on the above I think you could justify saying that it does look like the content was emphasized more in the second output. :nerd_face:

Thanks again for doing the experiment and sharing your results, so that we can all learn from them!


@gent.spah, @paulinpaloalto thanks for your attention to my question.

@paulinpaloalto I agree that the slightly noticeable difference you pointed out during the detailed inspection of the images does exist.

At the same time, I noticed that in the paper recommended in the lab (Gatys, L. A., Ecker, A. S., & Bethge, M. (2015). A Neural Algorithm of Artistic Style, p. 12), the magnitude of the values for alpha and beta was much larger than in the lab:

[Image: ratio]

So I decided to apply the same approach and used values for alpha and beta that differ significantly.

Conditions 1:

alpha = 10000, beta = 1

STYLE_LAYERS = [
    ('block1_conv1', 0.2),
    ('block2_conv1', 0.2),
    ('block3_conv1', 0.2),
    ('block4_conv1', 0.2),
    ('block5_conv1', 0.2)]

content_layer = [('block3_conv4', 1)]

Resulting Image:

Conditions 2:

alpha = 1, beta = 10000

STYLE_LAYERS = [
    ('block1_conv1', 0.2),
    ('block2_conv1', 0.2),
    ('block3_conv1', 0.2),
    ('block4_conv1', 0.2),
    ('block5_conv1', 0.2)]

content_layer = [('block3_conv4', 1)]

Resulting Image 2:

@gent.spah now I understand your answer about the magnitudes of alpha and beta. With a significant difference in the magnitude of the alpha and beta parameters, their role becomes more obvious.

Dear mentors, thank you once again for your guidance.


Hi, Veronika.

Thanks for sharing the results of your additional research! It’s a good point that maybe thinking of the balance between $\alpha$ and $\beta$ on a linear scale is the wrong way to look at it. When you convert to a logarithmic scale, the results are a lot clearer.

Or at least the $\alpha = 10000$ case is clear: you just get the content image apparently unmodified.

But I’d say that your image in the $\alpha = 1$ and $\beta = 10000$ case looks almost identical to the image from your original post in the $\alpha = 40$ and $\beta = 10$ case. That seems a little unexpected. It might be worth a look to confirm whether you see the same similarity.
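One way to explore the balance on a logarithmic scale is to step the alpha/beta ratio by powers of ten instead of swapping nearby values. The helper below is a hypothetical sketch (the name `log_spaced_weights` and its defaults are assumptions, not part of the lab). It only produces the weight pairs and leaves the training run itself to the notebook.

```python
def log_spaced_weights(min_exp=-4, max_exp=4, step=2):
    """Return (alpha, beta) pairs whose ratio alpha/beta is 10**k.

    k runs from min_exp to max_exp in increments of step. One of the
    two weights is always 1, so only the ratio changes between pairs.
    """
    pairs = []
    for k in range(min_exp, max_exp + 1, step):
        if k >= 0:
            pairs.append((10.0 ** k, 1.0))      # content weighted more
        else:
            pairs.append((1.0, 10.0 ** (-k)))   # style weighted more
    return pairs


for alpha, beta in log_spaced_weights():
    print(f"alpha={alpha:g}, beta={beta:g}, alpha/beta={alpha / beta:g}")
```

The default range covers both extreme settings tried in this thread, (1, 10000) and (10000, 1), plus three points in between.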


Hi, Paul.

I generated images with alpha and beta settings of 10 and 40 once again. The outputs I obtained differed from those provided in the original post. I probably made an error in the code, although I tried to carefully monitor the changes I made to the code chunks containing my “independent variables” :slight_smile:

Conditions 1:

alpha = 10, beta = 40

content_layer = [('block3_conv4', 1)]

STYLE_LAYERS = [
    ('block1_conv1', 0.2),
    ('block2_conv1', 0.2),
    ('block3_conv1', 0.2),
    ('block4_conv1', 0.2),
    ('block5_conv1', 0.2)]

Conditions 2:

alpha = 40, beta = 10

content_layer = [('block3_conv4', 1)]

STYLE_LAYERS = [
    ('block1_conv1', 0.2),
    ('block2_conv1', 0.2),
    ('block3_conv1', 0.2),
    ('block4_conv1', 0.2),
    ('block5_conv1', 0.2)]

However, even in the latest results, the small difference between the alpha and beta values (10 vs. 40) produced a smaller difference in the resulting images than the alpha-to-beta ratio of 1/10,000 did.
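A quick way to quantify this is to count the powers of ten between the two alpha/beta ratios in each experiment. The helper name `decades_between` is made up for this sketch.

```python
import math


def decades_between(weights_a, weights_b):
    """Powers of ten separating the alpha/beta ratios of two settings.

    Each argument is an (alpha, beta) pair.
    """
    ratio_a = weights_a[0] / weights_a[1]
    ratio_b = weights_b[0] / weights_b[1]
    return abs(math.log10(ratio_a) - math.log10(ratio_b))


small_swap = decades_between((10, 40), (40, 10))
large_swap = decades_between((10000, 1), (1, 10000))

print(f"10/40 vs 40/10:      {small_swap:.2f} decades")
print(f"10000/1 vs 1/10000:  {large_swap:.2f} decades")
```

Swapping 10 and 40 moves the ratio by about 1.2 decades, while swapping 1 and 10000 moves it by 8.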

Thank you very much for your response.


Hi, Veronika.

Thanks for closing the loop on this by rerunning the earlier cases. Yes! I agree that all the results now make sense and it is clear that going with the “logarithmic scale” for the ratio between $\alpha$ and $\beta$ is the better way to go.

Science! :nerd_face:

Regards,
Paul
