Central Limit Theorem (CLT)

What Is the Central Limit Theorem (CLT)?

In probability theory, the central limit theorem (CLT) states that the distribution of a sample statistic, such as the sample mean, approximates a normal distribution (i.e., a “bell curve”) as the sample size becomes larger, assuming that all samples are identical in size, and regardless of the population's actual distribution shape.

Put another way, the CLT is a statistical premise that, given a sufficiently large sample size drawn from a population with a finite level of variance, the mean of the sampled observations will be approximately equal to the mean of the whole population. Furthermore, the sample means follow an approximately normal distribution, and, per the law of large numbers, the sample variances approach the population variance as the sample size gets larger.
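
As a rough illustration (not part of the original article), the following Python sketch uses numpy to draw repeated samples from a heavily skewed exponential population. Under the CLT, the resulting sample means should be approximately normal, centered near the population mean, with variance near the population variance divided by the sample size; the seed and sample sizes are arbitrary choices for demonstration.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Population: exponential distribution (heavily right-skewed, far from normal).
# For scale=1, its true mean and variance are both 1.0.
pop_mean, pop_var = 1.0, 1.0

n = 50           # size of each sample
trials = 10_000  # number of independent samples

# Draw `trials` samples of size n and compute each sample's mean.
sample_means = rng.exponential(scale=1.0, size=(trials, n)).mean(axis=1)

# The CLT predicts the sample means are roughly normal with
# mean ~= pop_mean and variance ~= pop_var / n.
print(f"mean of sample means:     {sample_means.mean():.4f}  (population mean: {pop_mean})")
print(f"variance of sample means: {sample_means.var():.5f} (sigma^2 / n: {pop_var / n:.5f})")
```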

Although this concept was first developed by Abraham de Moivre in 1733, it was not formally named until 1920, when the noted Hungarian mathematician George Pólya dubbed it the central limit theorem.

Key Takeaways

The central limit theorem (CLT) states that the distribution of sample means approximates a normal distribution as the sample size gets larger, regardless of the population's distribution.
Sample sizes equal to or greater than 30 are often considered sufficient for the CLT to hold.
A key aspect of the CLT is that the average of the sample means and standard deviations will approach the population mean and standard deviation as the sample size grows.
A sufficiently large sample size can predict the characteristics of a population more accurately.

Understanding the Central Limit Theorem

According to the central limit theorem, the mean of a sample of data will be closer to the mean of the overall population in question as the sample size increases, regardless of the actual distribution of the data. In other words, the result holds whether the underlying distribution is normal or non-normal.

As a general rule, sample sizes of around 30-50 are deemed sufficient for the CLT to hold, meaning that the distribution of the sample means is fairly normally distributed. Therefore, the more samples one takes, the more the graphed results take the shape of a normal distribution. Note, however, that the central limit theorem will often still provide a reasonable approximation for much smaller sample sizes, such as n=5 or n=8, as the simulation below suggests.
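
To see how the approximation improves with sample size, a small simulation can track the skewness of the sample-mean distribution as n grows. The exponential population and the plain-numpy skewness helper below are illustrative assumptions, not part of the original article.

```python
import numpy as np

rng = np.random.default_rng(seed=1)
trials = 20_000

def skewness(x):
    # Standardized third central moment; roughly 0 for a normal distribution.
    x = np.asarray(x)
    return np.mean((x - x.mean()) ** 3) / x.std() ** 3

# Skewed population: exponential, whose own skewness is about 2.
for n in (5, 8, 30, 50):
    means = rng.exponential(scale=1.0, size=(trials, n)).mean(axis=1)
    print(f"n={n:>2}: skewness of sample means = {skewness(means):+.3f}")
```

For an exponential population, the skewness of the sample mean falls off roughly as 2/√n, so the printed values shrink toward zero as n grows, which is the CLT's approximation taking hold.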

The central limit theorem is often used in conjunction with the law of large numbers, which states that the average of the sample means and standard deviations will come closer to equaling the population mean and standard deviation as the sample size grows. This combination is extremely useful for accurately predicting the characteristics of populations.
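
A minimal sketch of the law of large numbers in action, using a fair six-sided die (population mean 3.5) as an assumed example:

```python
import numpy as np

rng = np.random.default_rng(seed=2)

# Population with known mean 3.5: a fair six-sided die.
rolls = rng.integers(1, 7, size=100_000)

# Running average after each additional roll.
running_mean = np.cumsum(rolls) / np.arange(1, rolls.size + 1)

# The running mean should drift toward 3.5 as the sample grows.
for k in (10, 100, 1_000, 10_000, 100_000):
    print(f"after {k:>6} rolls: running mean = {running_mean[k - 1]:.4f}  (population mean: 3.5)")
```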

The Central Limit Theorem in Finance

The CLT is useful when examining the returns of an individual stock or of broader indices, because the necessary financial data are relatively easy to generate, which keeps the analysis simple. Consequently, investors of all types rely on the CLT to analyze stock returns, construct portfolios, and manage risk.

Say, for example, an investor wishes to analyze the overall return for a stock index that comprises 1,000 equities. In this scenario, the investor may simply study a random sample of stocks to estimate the returns of the total index. To be safe, at least 30-50 randomly selected stocks across various sectors should be sampled for the central limit theorem to hold. Furthermore, previously selected stocks must be swapped out with different names to help eliminate bias.
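
As an illustration of this sampling approach, the sketch below fabricates returns for a hypothetical 1,000-stock index (the numbers are made up for demonstration; real returns would come from a market-data source) and estimates the index's mean return from a random 50-stock sample drawn without replacement:

```python
import numpy as np

rng = np.random.default_rng(seed=3)

# Hypothetical data: annual returns for the 1,000 stocks in the index.
# (Illustrative only; real analysis would use actual market data.)
index_returns = rng.normal(loc=0.08, scale=0.20, size=1_000)
true_index_mean = index_returns.mean()

# Estimate the index's average return from a random sample of 50 stocks,
# drawn without replacement so no stock is counted twice.
sample = rng.choice(index_returns, size=50, replace=False)

print(f"true mean return of all 1,000 stocks: {true_index_mean:.4f}")
print(f"estimate from a 50-stock sample:      {sample.mean():.4f}")
```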

Related terms:

Business Valuation: Methods & Examples

Business valuation is the process of estimating the value of a business or company.

Descriptive Statistics

Descriptive statistics is a set of brief descriptive coefficients that summarize a given data set representative of an entire or sample population.

Law of Large Numbers

The law of large numbers, in probability and statistics, states that as a sample size grows, its mean gets closer to the average of the whole population.

Normal Distribution

Normal distribution is a continuous probability distribution wherein values lie in a symmetrical fashion mostly situated around the mean.

Sampling Distribution

A sampling distribution describes the data chosen for a sample from among a larger population.

Sampling Error

A sampling error is a statistical error that occurs when a sample does not represent the entire population.

Statistics

Statistics is the collection, description, analysis, and inference of conclusions from quantitative data.

T-Test

A t-test is a type of inferential statistic used to determine if there is a significant difference between the means of two groups, which may be related in certain features.

T Distribution

A T distribution is a type of probability function that is appropriate for estimating population parameters for small sample sizes or unknown variances.

Variance: Formula & Calculation

Variance is a measurement of the spread between numbers in a data set. Investors use the variance equation to evaluate a portfolio's asset allocation.