Rescaling methods for outlier data

Hi all,

I just finished the rescaling lesson, and I wanted to ask whether there are ways to handle data sets whose values aren’t neatly distributed.

If I used these rescaling methods on data that has outlier tails or is not normally distributed, it would be challenging to keep the rescaled data in a confined range.

I guess there are methods for normalizing data (not just rescaling it)?
Would we encounter them somewhere during the specialization?

Thanks for your help,
Yuri.

In practice we don’t really need a strictly confined range. Anything that gets the features to roughly zero mean, with a range spanning less than an order of magnitude (say between about -3 and +3), will work fine.

Scaling by 1 / (the standard deviation) is a good choice. Or, if you know your data has a different distribution, you can scale according to that instead.
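
To make that concrete, here is a minimal sketch of the mean-and-standard-deviation scaling in plain Python. The `standardize` helper and the exponentially distributed sample feature are made up for illustration (they are not from the course), and using the population standard deviation is just one reasonable choice.

```python
import random
import statistics


def standardize(values):
    """Subtract the mean, then divide by the standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation (an assumption)
    if std == 0:
        # A constant feature has no spread, so map it to all zeros.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


# Hypothetical skewed feature: exponential draws have a long right tail.
rng = random.Random(0)
feature = [rng.expovariate(1.0) for _ in range(1000)]

scaled = standardize(feature)

# Mean should be close to 0 and standard deviation close to 1.
print(statistics.fmean(scaled), statistics.pstdev(scaled))
# The extremes show how far the tails reach after scaling.
print(min(scaled), max(scaled))
```

With a long-tailed feature like this one, a few values may still land outside -3 to +3 after scaling, which is usually acceptable since the range does not have to be strictly confined.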