Normalization meaningfully impacts lambda, right?

Hey @am003e,

Your statement above is correct. As far as I understand, though, if you know the normalization factor for each layer, it doesn't matter whether you apply it explicitly as a separate factor or fold it implicitly into \lambda. And since you know the network, you can easily work out the normalization factor for each layer. I guess this is what Prof. Andrew meant.
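To make this concrete, here is a minimal sketch, assuming the normalization factor is a known constant for each layer. The numbers and the names (`raw_costs`, `norm_factors`, `lambdas`) are made up for illustration and are not taken from the course code.

```python
import math

# Hypothetical values: unnormalized per-layer costs, the known
# per-layer normalization constants, and the per-layer weights (lambda).
raw_costs = [120.0, 45.0, 8.0]
norm_factors = [1 / 400.0, 1 / 100.0, 1 / 25.0]
lambdas = [0.2, 0.3, 0.5]


def explicit_total(raw, norms, lams):
    """Apply the normalization factor as its own term, then weight by lambda."""
    return sum(lam * c * r for lam, c, r in zip(lams, norms, raw))


def absorbed_total(raw, norms, lams):
    """Fold the normalization factor into lambda first, then weight the raw cost."""
    absorbed_lambdas = [lam * c for lam, c in zip(lams, norms)]
    return sum(a * r for a, r in zip(absorbed_lambdas, raw))


total_explicit = explicit_total(raw_costs, norm_factors, lambdas)
total_absorbed = absorbed_total(raw_costs, norm_factors, lambdas)

# Both formulations give the same total cost (up to floating-point rounding).
print(math.isclose(total_explicit, total_absorbed))
```

The catch is that the absorbed \lambda values now depend on the layer sizes, so a \lambda that works for one network would likely need retuning for another.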

Nonetheless, in my opinion, it's simpler to include the normalization factor explicitly and then, as you stated, allow \lambda to be more comprehensive. I hope this helps.

Cheers,
Elemento
