Input normalisation in case of small standard deviation

Hello,
While training my model I see the weights exploding. I'm using the Leaky ReLU activation function on a regression problem.
I normalised both input and output.
While investigating why, I noticed that the input is quite small (naturally bounded between -1 and 1) with a small std, usually around 0.2-0.3. So after normalisation the input gets scaled up quite a lot. The variance is of course still 1, but I'm concerned about the occasional "spikes".
Is this a real issue? Is there a rule for normalising very correlated inputs?
Thanks
Riccardo
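
To illustrate the scaling effect described above, here is a minimal sketch (the distribution parameters are assumptions chosen to mimic an input bounded in [-1, 1] with std around 0.25; the data is synthetic):

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical input: naturally bounded in [-1, 1] with a small spread,
# mimicking the std of ~0.2-0.3 described in the post.
x = np.clip(rng.normal(loc=0.0, scale=0.25, size=10_000), -1.0, 1.0)

mu, sigma = x.mean(), x.std()
z = (x - mu) / sigma  # standard z-score normalisation

print(f"raw:  std={sigma:.3f}, max |x|={np.abs(x).max():.2f}")
print(f"norm: std={z.std():.3f}, max |z|={np.abs(z).max():.2f}")
# Dividing by a std of ~0.25 multiplies every value by ~4, so the rare
# tail samples ("spikes") end up with magnitudes around 3-4 even though
# the raw data never left [-1, 1].
```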

Hello Riccardo! Interesting post.

  1. Is the gradient exploding with unnormalised data?
  2. Is the gradient exploding with only the input normalised (without normalising the output)?

Best,
Saif.

Hello,

  1. I tried just removing the mean from the input, with no sigma scaling, and things improve: it doesn't explode anymore. To clarify, it was exploding on a network with 8 hidden layers and about 900 nodes per layer; it was not doing it on smaller networks. Still, the NN is not learning, i.e. the training loss decreases only down to a threshold no matter how many iterations or how large the NN. This is why I started investigating this, but it might be another issue.
  2. Yes, it does, even worse.

Thanks
Riccardo
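
The centring-only transform from point 1 can be sketched like this (a minimal sketch, assuming NumPy arrays with samples in rows; the synthetic data and function names are illustrative, not Riccardo's actual code):

```python
import numpy as np

def center_only(x):
    """Remove the per-feature mean but keep the original scale."""
    return x - x.mean(axis=0)

def standardize(x):
    """Full z-score: remove the per-feature mean and divide by the std."""
    return (x - x.mean(axis=0)) / x.std(axis=0)

rng = np.random.default_rng(1)
# Synthetic stand-in: features bounded in [-1, 1] with std ~0.25.
x = np.clip(rng.normal(0.0, 0.25, size=(1000, 8)), -1.0, 1.0)

c = center_only(x)
z = standardize(x)
print("centered max |x|:    ", np.abs(c).max())  # stays near 1
print("standardised max |x|:", np.abs(z).max())  # tail spikes of ~3-4
```

Centring keeps every value roughly in its original range, which is consistent with the observation that the explosion went away once the division by sigma was dropped.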