# Normalization and Standardization in Data Pre-processing

• What are normalization and standardization?
• Why do we need normalization and standardization?
• What are the advantages of normalization and standardization?

What is Normalization?

Normalization: This process is commonly called column normalization, because it is mostly applied column-wise, that is, to each feature of a dataset. It is one of the important steps in data pre-processing before applying any operation or algorithm.

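Procedure: replace each value $x_i$ with $x_i' = \frac{x_i - x_{\min}}{x_{\max} - x_{\min}}$, where $x_i$ is a particular value in a feature, $x_{\min}$ is the minimum value of that column, and $x_{\max}$ is the maximum value of that column. After doing this for every column, all values in the dataset lie in the range [0, 1].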
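The procedure above can be sketched in plain Python. This is a minimal illustration rather than a reference implementation: the function names `min_max_normalize` and `normalize_dataset` are made up for this example, and mapping a constant column to 0.0 is an assumption, since the formula is undefined when the maximum equals the minimum.

```python
def min_max_normalize(column):
    """Rescale one column (a list of numbers) to the range [0, 1]."""
    x_min = min(column)
    x_max = max(column)
    span = x_max - x_min
    if span == 0:
        # Assumption: a constant column has no spread, so map it to 0.0
        # instead of dividing by zero.
        return [0.0 for _ in column]
    return [(x - x_min) / span for x in column]


def normalize_dataset(rows):
    """Apply min-max normalization to every column of a list of rows."""
    columns = list(zip(*rows))                     # rows -> columns
    scaled = [min_max_normalize(col) for col in columns]
    return [list(row) for row in zip(*scaled)]     # columns -> rows


heights_cm = [150, 160, 170, 180, 190]
print(min_max_normalize(heights_cm))  # [0.0, 0.25, 0.5, 0.75, 1.0]
```

The order of the values is preserved and only the scale changes. In practice a library scaler such as scikit-learn's `MinMaxScaler` is the usual choice.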

Advantages of normalization:

1. It scales all the values without destroying the relationships between the data points.
2. It avoids calculations with very large values.

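Geometric interpretation of normalization: every data point is mapped into the unit hypercube $[0, 1]^n$ (the unit square when there are two features).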

What is Standardization?

Standardization: This is the practice of shifting and scaling each column of the data so that its mean is zero and its standard deviation is 1. It is a common step in the data cleaning process. Standardizing the features often helps algorithms that are sensitive to feature scale, such as distance-based and gradient-based methods, perform better.

The process of applying standardization:

Find:

• $\mu$ – Mean of the column on which standardization is to be applied
• $\sigma$ – Standard deviation of the column on which standardization is to be applied

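Then replace each value $x_i$ in the column with $x_i'$, where $x_i' = \frac{x_i - \mu}{\sigma}$, $\mu$ is the mean of the column, and $\sigma$ is its standard deviation.

Advantages of standardization:

• The geometric interpretation of the loss function becomes more accurate.
• The data is centered at 0 with a standard deviation of 1. Unlike normalization, the values are not confined to a fixed range.
• Squashing: features measured on very different scales are brought onto a comparable scale.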
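The standardization steps described above can be sketched with Python's standard `statistics` module. This is an illustrative sketch under some assumptions: the function name `standardize` is hypothetical, the population standard deviation (`pstdev`) is used because the text does not say which variant is meant, and a constant column is mapped to 0.0 to avoid dividing by zero.

```python
import statistics


def standardize(column):
    """Shift and scale one column so its mean is 0 and its std-dev is 1."""
    mu = statistics.mean(column)
    # Assumption: population standard deviation; the sample version
    # (statistics.stdev) would give slightly different values.
    sigma = statistics.pstdev(column)
    if sigma == 0:
        # Assumption: a constant column has no spread, so map it to 0.0.
        return [0.0 for _ in column]
    return [(x - mu) / sigma for x in column]


values = [2, 4, 4, 4, 5, 5, 7, 9]   # mean 5, population std-dev 2
z = standardize(values)
print(z)  # [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]
```

The result has mean 0 and standard deviation 1, but individual values such as -1.5 and 2.0 fall outside any fixed interval, which is the key difference from min-max normalization. In practice a library scaler such as scikit-learn's `StandardScaler` is the usual choice.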

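Geometric interpretation of standardization: the data cloud is shifted so that its mean lies at the origin and rescaled so that each feature has unit standard deviation.

By abhinavsinghml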