Scalers and Transformers

Why Scaling?

Scaling-Sensitive Models

Scaling-Insensitive Models

Load Raw Data

Normalization

$$x_{i} = \frac{x_{i} - \min(x)}{\max(x) - \min(x)}$$

Pros:

* All features end up on the same fixed scale (e.g., [0, 1])
* Preserves the relationships among the original data values
* Yields smaller standard deviations than standardization, which can suppress the effect of mild outliers
* Works well when the standard deviation is small and the distribution is not Gaussian

Cons:

* Does not handle extreme outliers well, since the min and max are determined by them
* Normalized data may not meet the assumptions of some models, e.g., KDE
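The min-max formula above is what scikit-learn's `MinMaxScaler` implements. A minimal sketch on a toy matrix (values chosen only for illustration):

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Toy feature matrix; each column is rescaled independently.
X = np.array([[1.0, 200.0],
              [5.0, 400.0],
              [9.0, 600.0]])

scaler = MinMaxScaler()  # default feature_range=(0, 1)
X_scaled = scaler.fit_transform(X)
print(X_scaled)
# [[0.  0. ]
#  [0.5 0.5]
#  [1.  1. ]]
```

Note that both columns now span exactly [0, 1], regardless of their original units.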

Standardization (Zero-mean normalization)

$$x_{i} = \frac{x_{i} - \mathrm{mean}(x)}{\mathrm{std}(x)}$$
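This is scikit-learn's `StandardScaler`. A small sketch showing that the transformed feature has (approximately) zero mean and unit standard deviation:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])

# Center each feature to mean 0 and scale to unit variance.
scaler = StandardScaler()
X_std = scaler.fit_transform(X)
print(X_std.mean(), X_std.std())  # ~0.0, ~1.0
```

Unlike min-max normalization, the result is not bounded to a fixed range, but it is far less distorted by a single extreme value.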

Max Abs Scaler

$$x_{i} = \frac{x_{i}}{\max(|x|)}$$
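Scikit-learn's `MaxAbsScaler` implements this: it divides by the maximum absolute value per feature, mapping data into [-1, 1] without shifting or centering, so sparsity (zeros) is preserved. A minimal sketch:

```python
import numpy as np
from sklearn.preprocessing import MaxAbsScaler

# One feature whose largest absolute value is 8.
X = np.array([[-4.0], [2.0], [8.0]])
X_scaled = MaxAbsScaler().fit_transform(X)
print(X_scaled.ravel())  # [-0.5   0.25  1.  ]
```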

Robust Scaler

$$x_{i} = \frac{x_{i} - Q_{2}(x)}{Q_{3}(x) - Q_{1}(x)}$$

where $Q_{2}(x)$ is the median and $Q_{3}(x) - Q_{1}(x)$ is the interquartile range (IQR).
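Because the median and IQR are robust statistics, scikit-learn's `RobustScaler` is barely affected by outliers. A sketch with one deliberately extreme value:

```python
import numpy as np
from sklearn.preprocessing import RobustScaler

# A feature with one large outlier (100); median = 3, IQR = 4 - 2 = 2.
X = np.array([[1.0], [2.0], [3.0], [4.0], [100.0]])
X_scaled = RobustScaler().fit_transform(X)
print(X_scaled.ravel())  # [-1.  -0.5  0.   0.5 48.5]
```

The inliers land on a sensible scale while the outlier remains visibly extreme, instead of compressing everything else as min-max scaling would.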

Quantile Transformer Scaler (Rank scaler)
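The rank scaler in scikit-learn is `QuantileTransformer`: it maps each value to its quantile, so the output depends only on the ordering of values and extreme outliers are squashed into the target range. A minimal sketch on skewed synthetic data:

```python
import numpy as np
from sklearn.preprocessing import QuantileTransformer

rng = np.random.RandomState(0)
X = rng.lognormal(size=(1000, 1))  # heavily right-skewed sample

# Map values to their empirical quantiles in [0, 1].
qt = QuantileTransformer(n_quantiles=100, output_distribution='uniform')
X_uniform = qt.fit_transform(X)
print(X_uniform.min(), X_uniform.max())
```

`output_distribution='normal'` instead maps the ranks through the Gaussian inverse CDF. Being rank-based, the transform is non-linear and distorts distances between values.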

Power Transformer Scaler
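Scikit-learn's `PowerTransformer` applies a parametric power transform (Box-Cox or Yeo-Johnson) fitted by maximum likelihood to make data more Gaussian-like. A sketch on strictly positive, skewed data:

```python
import numpy as np
from sklearn.preprocessing import PowerTransformer

rng = np.random.RandomState(0)
X = rng.exponential(size=(500, 1))  # skewed, strictly positive

# Box-Cox requires strictly positive inputs; Yeo-Johnson also handles <= 0.
pt = PowerTransformer(method='box-cox')  # standardize=True by default
X_gauss = pt.fit_transform(X)
print(X_gauss.mean(), X_gauss.std())  # ~0.0, ~1.0 after standardization
```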

Unit Vector Scaler

Unit Norm Scaler
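Both of these correspond to scikit-learn's `Normalizer`. Unlike the scalers above, which operate per feature (column), it rescales each sample (row) to unit norm. A minimal sketch:

```python
import numpy as np
from sklearn.preprocessing import Normalizer

# Each row is divided by its own L2 norm.
X = np.array([[3.0, 4.0],
              [1.0, 0.0]])
X_unit = Normalizer(norm='l2').fit_transform(X)
print(X_unit)  # [[0.6 0.8], [1. 0.]]
```

`norm='l1'` or `norm='max'` are also available. Row-wise unit scaling is common when only the direction of a sample vector matters, e.g., with cosine similarity.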

K-Bins Discretization
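Scikit-learn's `KBinsDiscretizer` buckets continuous values into discrete bins. A sketch using equal-width bins with ordinal encoding (bin edges here follow from the uniform strategy on this toy data):

```python
import numpy as np
from sklearn.preprocessing import KBinsDiscretizer

X = np.array([[-3.0], [0.0], [6.0], [9.0]])

# 'uniform' builds 3 equal-width bins over [-3, 9]: edges -3, 1, 5, 9.
# 'ordinal' encodes each value by its bin index.
kbins = KBinsDiscretizer(n_bins=3, encode='ordinal', strategy='uniform')
X_binned = kbins.fit_transform(X)
print(X_binned.ravel())  # [0. 0. 2. 2.]
```

`encode='onehot'` returns a sparse one-hot matrix instead, and `strategy='quantile'` makes each bin hold roughly the same number of samples.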

Feature Binarization
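Binarization thresholds a feature into 0/1 values, implemented in scikit-learn as `Binarizer`. A minimal sketch:

```python
import numpy as np
from sklearn.preprocessing import Binarizer

# Values strictly greater than the threshold become 1; the rest become 0.
X = np.array([[0.2, 1.5, -0.3]])
X_bin = Binarizer(threshold=0.5).fit_transform(X)
print(X_bin)  # [[0. 1. 0.]]
```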

Function Transformers
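Scikit-learn's `FunctionTransformer` wraps an arbitrary function so it can be used like any other transformer, e.g., inside a `Pipeline`. A sketch wrapping a log transform (the choice of `log1p` here is just an illustration):

```python
import numpy as np
from sklearn.preprocessing import FunctionTransformer

# Wrap np.log1p; supplying inverse_func enables inverse_transform.
log_tf = FunctionTransformer(func=np.log1p, inverse_func=np.expm1)

X = np.array([[0.0], [np.e - 1.0]])
X_log = log_tf.fit_transform(X)
print(X_log.ravel())  # ~ [0. 1.]

# Round-trip back to the original values.
X_back = log_tf.inverse_transform(X_log)
```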

Reference