Normalization meaningfully impacts lambda, right?

In the lecture for “Neural Style Transfer - Style Cost Function,” Ng states a couple of times that the normalization factor in the correlation doesn’t matter much, because it gets multiplied by a hyperparameter that we tune anyway. But intuitively, it seems that in the case of lambda, the opposite is true. The lambdas (one hyperparameter value is chosen per layer) are intended to dictate how much emphasis shallower or deeper features in the net receive when judging style transfer.

If we want a set of lambda values to have intuitive or generalizable meaning, wouldn’t we want them to apply to values which have been “properly” normalized? Otherwise the lambda values end up distorted, since they must also compensate for the missing normalization at each layer. Correct?

On the other hand, if lambdas are not intended or expected to have human-understandable meaning, then normalization truly doesn’t matter. Am I correct in interpreting these lectures as implying that lambdas should be comprehensible in nature?

Hey @am003e,

What you said above is correct. However, as far as I understand, if you know the normalization factor for each layer, it doesn’t matter whether you apply it explicitly as a normalization factor or fold it implicitly into \lambda. And since you know the network, you can easily work out the normalization factor for each layer. I guess this is what Prof. Andrew meant.
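Here is a rough NumPy sketch of that equivalence. It assumes the per-layer normalization factor 1 / (2 · n_H · n_W · n_C)^2 from the lecture; the layer shapes, the lambda values, and the random activations standing in for real network features are all made up for illustration.

```python
import numpy as np

def gram_matrix(a):
    """Gram matrix (n_C, n_C) of activations with shape (n_H, n_W, n_C)."""
    n_H, n_W, n_C = a.shape
    flat = a.reshape(n_H * n_W, n_C)
    return flat.T @ flat

def unnormalized_style_cost(a_S, a_G):
    """Squared Frobenius distance between the two Gram matrices."""
    return float(np.sum((gram_matrix(a_S) - gram_matrix(a_G)) ** 2))

def norm_factor(shape):
    """Assumed normalization: 1 / (2 * n_H * n_W * n_C)^2."""
    n_H, n_W, n_C = shape
    return 1.0 / (2 * n_H * n_W * n_C) ** 2

rng = np.random.default_rng(0)

# Hypothetical layers (shallow to deep) and hypothetical lambdas.
layer_shapes = [(32, 32, 8), (16, 16, 16), (8, 8, 32)]
lambdas = [0.5, 0.3, 0.2]

explicit = 0.0           # lambda applied to the normalized cost
implicit = 0.0           # normalization folded into lambda
effective_lambdas = []   # lambda * normalization factor, per layer

for shape, lam in zip(layer_shapes, lambdas):
    a_S = rng.standard_normal(shape)  # stand-in for style activations
    a_G = rng.standard_normal(shape)  # stand-in for generated activations
    raw = unnormalized_style_cost(a_S, a_G)
    c = norm_factor(shape)

    explicit += lam * (c * raw)
    effective_lambdas.append(lam * c)
    implicit += effective_lambdas[-1] * raw

print("total style cost, explicit normalization:", explicit)
print("total style cost, folded into lambda:    ", implicit)
print("lambdas:          ", lambdas)
print("effective lambdas:", effective_lambdas)
```

The two totals agree up to floating-point rounding, so the optimizer sees the same objective either way. Note, though, that with these made-up shapes the folded-in weights increase from the shallow layer to the deep one while the original lambdas decrease, so the folded-in values no longer read as "emphasis per layer" on their own.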

Nonetheless, in my opinion, it’s simpler to include the normalization factor explicitly and, as you said, let \lambda be more comprehensible. I hope this helps.

