The Normal Distribution is a probability distribution that is symmetric about the mean: values near the mean occur more frequently than values far from it. In graphical form, it appears as a bell curve. It is named after Carl Friedrich Gauss, so it is also called the Gaussian Distribution. For this distribution, the Mean, Median, and Mode are all equal.
Mean = Median = Mode
A normal distribution keeps its normal shape under linear transformations: shifting or rescaling a normally distributed variable changes only its mean and standard deviation, and the result is still normal.
Source: https://www.scribbr.com/statistics/standard-normal-distribution/
According to the Empirical Rule for Normal Distribution:
- 68.3% of data lies within 1 standard deviation of the mean
- 95.4% of data lies within 2 standard deviations of the mean
- 99.7% of data lies within 3 standard deviations of the mean
Thus, almost all the data lies within 3 standard deviations of the mean. This rule helps us flag potential Outliers (points more than 3 standard deviations from the mean) and gives a quick check of whether a distribution is approximately normal.
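The percentages in the Empirical Rule can be reproduced from the normal distribution itself, since the probability of falling within k standard deviations of the mean equals erf(k / sqrt(2)). The sketch below uses only the Python standard library; the helper names `prob_within` and `flag_outliers` are illustrative and not from any particular library, and the 3-standard-deviation cutoff is a common convention rather than a strict rule.

```python
import math
import statistics

def prob_within(k: float) -> float:
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2))

def flag_outliers(data, k: float = 3.0):
    """Return the points lying more than k standard deviations from the sample mean."""
    mu = statistics.mean(data)
    sigma = statistics.pstdev(data)  # population standard deviation of the sample
    return [x for x in data if abs(x - mu) > k * sigma]

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {prob_within(k):.4f}")
# within 1 standard deviation(s): 0.6827
# within 2 standard deviation(s): 0.9545
# within 3 standard deviations(s) is printed as 0.9973
```

Note that a single extreme value also inflates the standard deviation, so on very small samples this kind of check can miss outliers.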
Skewness
Real-world datasets are often not symmetric. A distribution can have more data points on one side than the other, with a longer tail on the opposite side. This phenomenon is called Skewness.
Left-Skewed Distribution
In a left-skewed distribution, the data points are concentrated on the right side of the distribution, so the tail is longer on the left side. Since the tail extends in the negative direction, it is also known as a Negatively Skewed Distribution.
In a left-skewed distribution, typically:
Mode > Median > Mean
Source: https://corporatefinanceinstitute.com/resources/knowledge/other/negatively-skewed-distribution/
Right-Skewed Distribution
In a right-skewed distribution, the data points are concentrated on the left side of the distribution, so the tail is longer on the right side. Since the tail extends in the positive direction, it is also known as a Positively Skewed Distribution.
In a right-skewed distribution, typically:
Mode < Median < Mean
Source: https://corporatefinanceinstitute.com/resources/knowledge/other/positively-skewed-distribution/
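The direction of skew and the ordering of Mode, Median, and Mean can be checked on a small sample. The sketch below is a minimal illustration using the standard library: `sample_skewness` is a hypothetical helper that computes the moment-based skewness coefficient (third central moment divided by the cubed standard deviation), and the two samples are made-up data chosen to show each case.

```python
import statistics

def sample_skewness(data) -> float:
    """Moment-based skewness: positive for a right tail, negative for a left tail."""
    n = len(data)
    mu = statistics.mean(data)
    m2 = sum((x - mu) ** 2 for x in data) / n  # second central moment (variance)
    m3 = sum((x - mu) ** 3 for x in data) / n  # third central moment
    return m3 / m2 ** 1.5

# Made-up samples: most values are small, with a long tail to the right
right_skewed = [1, 2, 2, 2, 3, 4, 5, 6, 9, 15]
# Mirroring the sample flips the tail to the left
left_skewed = [-x for x in right_skewed]

for name, sample in (("right", right_skewed), ("left", left_skewed)):
    print(
        name,
        "skewness:", round(sample_skewness(sample), 2),
        "mode:", statistics.mode(sample),
        "median:", statistics.median(sample),
        "mean:", statistics.mean(sample),
    )
```

For the right-skewed sample the mode (2) is below the median (3.5), which is below the mean (4.9); the mirrored sample reverses the ordering. These orderings are typical of skewed data but are not guaranteed for every dataset.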
Visualization:
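One technique for checking the Normality of the data visually is the KDE (Kernel Density Estimation) method. In this method, a small continuous curve (a kernel, usually Gaussian) is centered on every data point, and the curves are added together to produce a single smooth density estimate. If the data is approximately normal, the resulting curve looks like a symmetric bell.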
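To make the idea concrete, here is a minimal hand-written sketch of a KDE with a Gaussian kernel, using only the standard library. The function name `make_kde`, the sample values, and the bandwidth of 0.5 are all assumptions chosen for illustration; plotting libraries normally pick the bandwidth automatically.

```python
import math

def make_kde(data, bandwidth: float):
    """Return a function that estimates the density at x by summing one Gaussian bump per data point."""
    n = len(data)
    scale = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x: float) -> float:
        # Each data point xi contributes a bell curve centered at xi
        return scale * sum(
            math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data
        )

    return density

# Made-up sample and bandwidth, for illustration only
sample = [4.1, 4.8, 5.0, 5.2, 5.3, 5.9, 6.4]
density = make_kde(sample, bandwidth=0.5)

# Evaluate the smooth estimate on a few points
for x in (4.0, 5.0, 6.0):
    print(x, round(density(x), 3))
```

A larger bandwidth gives a smoother curve that can hide real features of the data, while a smaller one follows individual points more closely.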
Python Code:
Source: https://seaborn.pydata.org/generated/seaborn.kdeplot