In the “Art Generation with Neural Style Transfer” programming assignment, we used alpha and beta (the weight factors for the content and style costs, respectively) of 10 and 40, a ratio of 1:4. In the original 2015 paper “A Neural Algorithm of Artistic Style”, the authors used ratios of 1:1000 and 1:10000 to generate their main images, and didn’t showcase any ratio closer than 1:100.
A ratio of 1:4 worked well in the course assignment. Is this because of a difference from the original algorithm, or has it been found more recently that a much closer ratio between alpha and beta works well?
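For reference, the total cost being tuned is the standard weighted sum from the assignment, so the alpha:beta ratio is the only thing that shifts the content/style balance:

J(G) = \alpha \, J_{content}(C, G) + \beta \, J_{style}(S, G)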
It’s a good point that things are different here than in the original paper. Another difference, noted by a student recently (and reported as an inconsistency between the lectures and the assignment), is that the scaling factor on the content cost function is shown in the lectures as \frac{1}{2} times the squared norm of the difference, which is also what the paper uses. In the assignment they use \frac{1}{4 \cdot n_H \cdot n_W \cdot n_C}, which of course gives a much smaller number. Perhaps that is connected to the other difference you note.
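To see how much those two scalings differ, here is a minimal NumPy sketch; the activation shape and function names are mine for illustration, not taken from the notebook:

```python
import numpy as np

def content_cost_paper(a_C, a_G):
    # Gatys et al. (2015): one half of the squared error.
    return 0.5 * np.sum((a_C - a_G) ** 2)

def content_cost_assignment(a_C, a_G):
    # Course notebook: normalized by the layer dimensions.
    n_H, n_W, n_C = a_C.shape
    return np.sum((a_C - a_G) ** 2) / (4 * n_H * n_W * n_C)

# Random activations with a hypothetical hidden-layer shape.
rng = np.random.default_rng(0)
a_C = rng.standard_normal((25, 25, 256))
a_G = rng.standard_normal((25, 25, 256))

# The paper's version is larger by a factor of 2 * n_H * n_W * n_C.
print(content_cost_paper(a_C, a_G))
print(content_cost_assignment(a_C, a_G))
```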
But in general, the results here are a lot more subjective than in a typical ML problem. The question is what looks pleasing to your eye, and perhaps they ran some more experiments and preferred the results they got with the modified formulation they actually used. There may also be other considerations, such as how many iterations you have to run to see “reasonable” results: in the notebook context, they may have made the tradeoffs differently because they can’t afford the resources to run more iterations of training. I’m just speculating here, as I have not tried any experiments with the NST code.
If you’re curious, you could try adjusting the scaling factors to be the ones in the paper and see how that affects the results. Let us know if you try this and discover anything interesting.
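If you do try it, the two knobs are the content cost normalization and the alpha/beta weights. A hypothetical sketch of the paper-style combination (assuming a_C, a_G, and J_style are already computed the way the notebook computes them):

```python
import numpy as np

def experiment_total_cost(a_C, a_G, J_style, alpha=1.0, beta=1000.0):
    # Paper's 1/2 scaling on the content term instead of the
    # notebook's 1 / (4 * n_H * n_W * n_C) normalization.
    J_content = 0.5 * np.sum((a_C - a_G) ** 2)
    # Paper's 1:1000 content-to-style ratio instead of the assignment's 1:4.
    return alpha * J_content + beta * J_style
```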