
Heteroskedasticity
In statistics, heteroskedasticity (or heteroscedasticity) happens when the standard deviations of a predicted variable, monitored over different values of an independent variable or as related to prior time periods, are non-constant. Unconditional heteroskedasticity refers to general structural changes in volatility that are not related to prior-period volatility. Conditional heteroskedasticity refers to non-constant volatility that is related to the volatility of the prior period (e.g., the previous day).

What Is Heteroskedasticity?
In statistics, heteroskedasticity (or heteroscedasticity) happens when the standard deviations of a predicted variable, monitored over different values of an independent variable or as related to prior time periods, are non-constant. With heteroskedasticity, the tell-tale sign upon visual inspection of the residual errors is that they will tend to fan out over time, as depicted in the image below.
Heteroskedasticity often arises in two forms: conditional and unconditional. Conditional heteroskedasticity refers to non-constant volatility that is related to the volatility of the prior period (e.g., the previous day). Unconditional heteroskedasticity refers to general structural changes in volatility that are not related to prior-period volatility. The unconditional form applies when future periods of high and low volatility can be identified in advance.
[Figure: residual errors fanning out over time. Image by Julie Bang © Investopedia 2019]
While heteroskedasticity does not cause bias in the coefficient estimates, it does make them less precise; lower precision increases the likelihood that the coefficient estimates are further from the correct population value.
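The fan-out pattern and the absence of bias can both be seen in a small simulation. The sketch below is illustrative only: the true intercept and slope, the error scale (standard deviation equal to 0.5 times x), the sample sizes, and the helper names `ols_fit` and `simulate` are all assumptions chosen for the example, not values from any real dataset.

```python
import random
import statistics

def ols_fit(x, y):
    """Return (intercept, slope) of a simple least-squares line."""
    mean_x = statistics.fmean(x)
    mean_y = statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def simulate(rng, n=200, intercept=1.0, slope=2.0):
    """Generate data whose error standard deviation grows with x (assumed form)."""
    x = [10 * (i + 1) / n for i in range(n)]
    y = [intercept + slope * xi + rng.gauss(0, 0.5 * xi) for xi in x]
    return x, y

rng = random.Random(42)

# One sample: compare residual spread in the low-x and high-x halves.
x, y = simulate(rng)
a, b = ols_fit(x, y)
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
half = len(x) // 2
spread_low = statistics.pstdev(residuals[:half])
spread_high = statistics.pstdev(residuals[half:])

# Many samples: the slope estimates should still center on the true slope of 2.0.
slopes = [ols_fit(*simulate(rng))[1] for _ in range(1000)]
mean_slope = statistics.fmean(slopes)
slope_spread = statistics.stdev(slopes)

print(f"residual spread, low x:  {spread_low:.2f}")
print(f"residual spread, high x: {spread_high:.2f}")
print(f"mean slope over 1000 samples: {mean_slope:.3f} (spread {slope_spread:.3f})")
```

The residual spread is larger in the high-x half of the sample, while the average of the slope estimates stays close to the true value. Individual estimates still vary from sample to sample, which is the loss of precision described above.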



The Basics of Heteroskedasticity
In finance, conditional heteroskedasticity is often seen in the prices of stocks and bonds. The level of volatility of these securities cannot be predicted over any period. Unconditional heteroskedasticity applies to variables that have identifiable seasonal variability, such as electricity usage.
As it relates to statistics, heteroskedasticity (also spelled heteroscedasticity) refers to error variance, or scatter, that depends on at least one independent variable within a particular sample. These variations can be used to calculate the margin of error between data sets, such as expected results and actual results, because they provide a measure of the deviation of data points from the mean value.
For a dataset to be considered relevant, the majority of the data points must be within a particular number of standard deviations from the mean as described by Chebyshev’s theorem, also known as Chebyshev’s inequality. This provides guidelines regarding the probability of a random variable differing from the mean.
Based on the number of standard deviations specified, a random variable has a particular probability of falling within that range. For example, it may be required that a range of two standard deviations contain at least 75% of the data points for the dataset to be considered valid. Variances outside the minimum requirement are often attributed to issues of data quality.
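The 75% figure for two standard deviations follows from Chebyshev's inequality, which guarantees that at least 1 - 1/k² of the values lie within k standard deviations of the mean. The following sketch computes that bound and compares it with a simulated sample; the function names and the choice of an exponential sample are assumptions made for illustration.

```python
import random
import statistics

def chebyshev_min_fraction(k):
    """Lower bound on the share of values within k standard deviations of the mean."""
    if k <= 1:
        return 0.0  # the bound carries no information for k <= 1
    return 1 - 1 / k ** 2

def observed_fraction(data, k):
    """Share of the data actually lying within k standard deviations of its mean."""
    mean = statistics.fmean(data)
    sd = statistics.pstdev(data)
    return sum(abs(v - mean) <= k * sd for v in data) / len(data)

rng = random.Random(0)
# A skewed, non-normal sample: the bound holds for any distribution.
sample = [rng.expovariate(1.0) for _ in range(10_000)]

for k in (2, 3):
    bound = chebyshev_min_fraction(k)
    seen = observed_fraction(sample, k)
    print(f"k={k}: guaranteed at least {bound:.3f}, observed {seen:.3f}")
```

The bound is deliberately conservative, so the observed share is usually well above the guaranteed minimum.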
The opposite of heteroskedastic is homoskedastic. Homoskedasticity refers to a condition in which the variance of the residual term is constant or nearly so. Homoskedasticity is one assumption of linear regression modeling. It is needed to ensure that the standard errors of the estimates are accurate, that the prediction limits for the dependent variable are valid, and that confidence intervals and p-values for the parameters are valid.
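One way to check the constant-variance assumption is to regress the squared residuals on the predictor and see whether the predictor explains them. The sketch below uses the studentized (Koenker) form of the Breusch-Pagan statistic, n times R², which is a technique brought in here for illustration and is not described in the text above. The data are simulated, and the helper names and parameters are assumptions.

```python
import random
import statistics

def ols_fit(x, y):
    """Return (intercept, slope) of a simple least-squares line."""
    mean_x = statistics.fmean(x)
    mean_y = statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def r_squared(x, y):
    """Share of the variance in y explained by a straight line in x."""
    a, b = ols_fit(x, y)
    mean_y = statistics.fmean(y)
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

def lm_statistic(x, y):
    """n * R^2 from regressing squared OLS residuals on x."""
    a, b = ols_fit(x, y)
    squared_residuals = [(yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y)]
    return len(x) * r_squared(x, squared_residuals)

# 5% critical value of the chi-square distribution with 1 degree of freedom.
CHI2_CRIT_5PCT_1DF = 3.841

rng = random.Random(7)
x = [10 * (i + 1) / 300 for i in range(300)]
y_const = [1 + 2 * xi + rng.gauss(0, 2.0) for xi in x]      # constant error spread
y_fan = [1 + 2 * xi + rng.gauss(0, 0.5 * xi) for xi in x]   # spread grows with x

lm_const = lm_statistic(x, y_const)
lm_fan = lm_statistic(x, y_fan)
print(f"constant spread: LM = {lm_const:.2f}")
print(f"growing spread:  LM = {lm_fan:.2f}")
print(f"5% critical value: {CHI2_CRIT_5PCT_1DF}")
```

A statistic above the critical value suggests that the residual variance changes with the predictor. With a single predictor the statistic is compared against a chi-square distribution with one degree of freedom.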
The Types of Heteroskedasticity
Unconditional
Unconditional heteroskedasticity is predictable and can relate to variables that are cyclical by nature. This can include higher retail sales reported during the traditional holiday shopping period or the increase in air conditioner repair calls during warmer months.
Changes in the variance can also be tied directly to particular events or predictive markers when the shifts are not traditionally seasonal. For example, smartphone sales may rise with the release of a new model; the activity is cyclical around the event but not necessarily determined by the season.
Heteroskedasticity can also arise where the data approach a boundary, since the boundary restricts the range of the data and the variance must therefore be smaller.
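The predictable character of unconditional heteroskedasticity can be sketched with simulated data in which the spread depends only on the calendar. Everything in this example is hypothetical: the "summer" months, the two standard deviations, the mean level of 100, and the number of observations per month.

```python
import random
import statistics
from collections import defaultdict

HIGH_VARIANCE_MONTHS = {6, 7, 8}   # hypothetical high-variability season
OBSERVATIONS_PER_MONTH = 200       # hypothetical sample size per calendar month

rng = random.Random(1)
by_month = defaultdict(list)
for month in range(1, 13):
    # The spread is fixed by the calendar, not by what happened in earlier periods.
    sd = 3.0 if month in HIGH_VARIANCE_MONTHS else 1.0
    for _ in range(OBSERVATIONS_PER_MONTH):
        by_month[month].append(rng.gauss(100, sd))

month_sd = {m: statistics.stdev(values) for m, values in sorted(by_month.items())}
for m, sd in month_sd.items():
    label = "high" if m in HIGH_VARIANCE_MONTHS else "low"
    print(f"month {m:2d}: sample sd = {sd:.2f} ({label}-variance season)")
```

Because the high-variance months are known ahead of time, an analyst could anticipate the wider spread before seeing the data.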
Conditional
Conditional heteroskedasticity is not predictable by nature. There is no telltale sign that leads analysts to believe data will become more or less scattered at any point in time. Often, financial products are considered subject to conditional heteroskedasticity as not all changes can be attributed to specific events or seasonal changes.
Conditional heteroskedasticity is commonly applied to stock markets, where today's volatility is strongly related to yesterday's volatility. Models built on this idea explain periods of persistently high volatility and persistently low volatility.
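The link between today's and yesterday's volatility can be sketched with a simple ARCH(1) process, in which each period's variance depends on the previous period's squared return. ARCH(1) is used here only as an illustrative stand-in, and the parameters `omega` and `alpha`, the series length, and the function names are assumptions.

```python
import random
import statistics

def autocorr_lag1(series):
    """Sample autocorrelation of a series at lag 1."""
    mean = statistics.fmean(series)
    num = sum((a - mean) * (b - mean) for a, b in zip(series, series[1:]))
    den = sum((v - mean) ** 2 for v in series)
    return num / den

def simulate_arch1(rng, n, omega=0.2, alpha=0.4):
    """Simulate returns whose variance is omega + alpha * (previous return)^2."""
    returns = []
    prev = 0.0
    for _ in range(n):
        variance = omega + alpha * prev ** 2
        prev = rng.gauss(0, variance ** 0.5)
        returns.append(prev)
    return returns

rng = random.Random(3)
returns = simulate_arch1(rng, 20_000)
squared = [r ** 2 for r in returns]

acf_returns = autocorr_lag1(returns)
acf_squared = autocorr_lag1(squared)
print(f"lag-1 autocorrelation of returns:         {acf_returns:.3f}")
print(f"lag-1 autocorrelation of squared returns: {acf_squared:.3f}")
```

The returns themselves show almost no autocorrelation, so their direction is not predictable, but the squared returns are clearly correlated from one period to the next. That produces the clusters of high and low volatility described above.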
Special Considerations
Heteroskedasticity and Financial Modeling
Heteroskedasticity is an important concept in regression modeling, and in the investment world, regression models are used to explain the performance of securities and investment portfolios. The most well-known of these is the Capital Asset Pricing Model (CAPM), which explains the performance of a stock in terms of its volatility relative to the market as a whole. Extensions of this model have added other predictor variables such as size, momentum, quality, and style (value versus growth).
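For reference, the standard textbook statement of CAPM and a generic multi-factor regression can be written as follows. The symbols are notation chosen for this sketch: $R_i$ is the return on security $i$, $R_m$ the market return, $R_f$ the risk-free rate, $F_k$ the additional factors, and $\varepsilon$ the error term.

```latex
% CAPM: expected return as a function of market sensitivity (beta)
\mathbb{E}[R_i] = R_f + \beta_i \left( \mathbb{E}[R_m] - R_f \right),
\qquad
\beta_i = \frac{\operatorname{Cov}(R_i, R_m)}{\operatorname{Var}(R_m)}

% Multi-factor regression with K additional predictor variables
R_{i,t} - R_{f,t} = \alpha_i + \beta_i \left( R_{m,t} - R_{f,t} \right)
  + \sum_{k=1}^{K} \gamma_{i,k} F_{k,t} + \varepsilon_{i,t}

% Homoskedastic errors versus heteroskedastic errors
\operatorname{Var}(\varepsilon_{i,t}) = \sigma^2
\quad \text{versus} \quad
\operatorname{Var}(\varepsilon_{i,t}) = \sigma_t^2
```

In this notation, heteroskedasticity means that the variance of $\varepsilon_{i,t}$ changes across observations instead of staying at a single value $\sigma^2$.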
These predictor variables have been added because they explain or account for variance in the dependent variable. In CAPM, that dependent variable is portfolio performance. For example, the developers of CAPM were aware that their model failed to explain an interesting anomaly: high-quality stocks, which were less volatile than low-quality stocks, tended to perform better than CAPM predicted. CAPM says that higher-risk stocks should outperform lower-risk stocks.
In other words, high-volatility stocks should beat lower-volatility stocks. But high-quality stocks, which are less volatile, tended to perform better than predicted by CAPM.
Later, other researchers extended CAPM (which had already been extended to include other predictor variables such as size, style, and momentum) to include quality as an additional predictor variable, also known as a "factor." With this factor included in the model, the performance anomaly of low-volatility stocks was accounted for. These models, known as multi-factor models, form the basis of factor investing and smart beta.