Durbin Watson Statistic

The Durbin Watson (DW) statistic is a test for autocorrelation in the residuals from a statistical model or regression analysis. A stock price displaying positive autocorrelation would indicate that yesterday's price has a positive correlation with today's price, so if the stock fell yesterday, it is also likely to fall today. For example, if you know that a stock historically has a high positive autocorrelation value and you witnessed the stock making solid gains over the past several days, then you might reasonably expect the movements over the upcoming several days (the leading time series) to match those of the lagging time series and to move upward. The Durbin Watson statistic is named after statisticians James Durbin and Geoffrey Watson.

What Is the Durbin Watson Statistic?

The Durbin Watson (DW) statistic is a test for autocorrelation in the residuals from a statistical model or regression analysis. The Durbin Watson statistic will always have a value between 0 and 4. A value of 2.0 indicates there is no autocorrelation detected in the sample. Values from 0 to less than 2 point to positive autocorrelation, and values from 2 to 4 indicate negative autocorrelation.

A stock price displaying positive autocorrelation would indicate that the price yesterday has a positive correlation with the price today, so if the stock fell yesterday, it is also likely to fall today. A security that has negative autocorrelation, on the other hand, has a negative influence on itself over time, so that if it fell yesterday, there is a greater likelihood it will rise today.
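To make the interpretation concrete, the short Python sketch below fits an ordinary least squares regression on simulated data and scores its residuals with the durbin_watson helper from the statsmodels library; the data and variable names are illustrative assumptions, not part of the worked example later in this article.

import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

# Simulated data purely for illustration; x and y stand in for any
# explanatory variable and response you might regress.
rng = np.random.default_rng(0)
x = np.arange(60, dtype=float)
y = 3.0 * x + rng.normal(scale=5.0, size=x.size)

# Fit an ordinary least squares regression and test its residuals.
results = sm.OLS(y, sm.add_constant(x)).fit()
dw = durbin_watson(results.resid)

# Roughly 2 means no autocorrelation; values toward 0 indicate positive
# autocorrelation and values toward 4 indicate negative autocorrelation.
print(f"Durbin Watson statistic: {dw:.2f}")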

The Durbin Watson statistic is a test for autocorrelation in a regression model's output.
The DW statistic ranges from zero to four, with a value of 2.0 indicating zero autocorrelation.
Values below 2.0 mean there is positive autocorrelation, and values above 2.0 indicate negative autocorrelation.
Autocorrelation can be useful in technical analysis, which is most concerned with the trends of security prices using charting techniques in lieu of a company's financial health or management.

The Basics of the Durbin Watson Statistic

Autocorrelation, also known as serial correlation, can be a significant problem in analyzing historical data if one does not know to look out for it. For instance, since stock prices tend not to change too radically from one day to another, the prices from one day to the next could potentially be highly correlated, even though there is little useful information in this observation. In order to avoid autocorrelation issues, the easiest solution in finance is to simply convert a series of historical prices into a series of percentage-price changes from day to day.
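A minimal pandas sketch of that conversion might look like the following; the price series and dates are made up purely for illustration.

import pandas as pd

# Hypothetical daily closing prices; in practice these would come from a data feed.
prices = pd.Series(
    [100.0, 101.5, 101.2, 102.8, 103.0, 101.9],
    index=pd.date_range("2023-01-02", periods=6, freq="B"),
)

# Convert price levels into day-over-day percentage changes so the series
# is no longer dominated by the slowly moving price level itself.
returns = prices.pct_change().dropna()
print(returns)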

Autocorrelation can be useful for technical analysis, which is most concerned with the trends of, and relationships between, security prices using charting techniques in lieu of a company's financial health or management. Technical analysts can use autocorrelation to see how much of an impact past prices for a security have on its future price.

Autocorrelation can show if there is a momentum factor associated with a stock. For example, if you know that a stock historically has a high positive autocorrelation value and you witnessed the stock making solid gains over the past several days, then you might reasonably expect the movements over the upcoming several days (the leading time series) to match those of the lagging time series and to move upward.
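One way to put a number on that momentum idea is to measure the lag-1 autocorrelation of daily returns, for example with pandas' built-in autocorr method; the return series below is an assumed placeholder.

import pandas as pd

# Hypothetical daily percentage returns for a stock (illustrative values only).
returns = pd.Series([0.012, 0.008, 0.015, -0.004, 0.010, 0.006, 0.011, -0.002])

# Lag-1 autocorrelation: how strongly today's return is correlated with yesterday's.
lag_one = returns.autocorr(lag=1)

# A clearly positive value hints at momentum; a negative value hints at reversal.
print(f"Lag-1 autocorrelation: {lag_one:.2f}")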

The Durbin Watson statistic is named after statisticians James Durbin and Geoffrey Watson.

Special Considerations

A rule of thumb is that DW test statistic values in the range of 1.5 to 2.5 are relatively normal. Values outside this range could, however, be a cause for concern. The Durbin Watson statistic, while displayed by many regression analysis programs, is not applicable in certain situations.

For instance, when lagged dependent variables are included among the explanatory variables, it is inappropriate to use this test.
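In that case, a commonly used alternative (not covered in this article) is the Breusch-Godfrey test, available in statsmodels as acorr_breusch_godfrey; the sketch below simulates a regression with a lagged dependent variable simply to show the call.

import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import acorr_breusch_godfrey

# Simulated random walk, used only to set up a regression in which
# yesterday's value of the dependent variable appears as a regressor.
rng = np.random.default_rng(1)
series = np.cumsum(rng.normal(size=200))
y_today = series[1:]
y_yesterday = series[:-1]

results = sm.OLS(y_today, sm.add_constant(y_yesterday)).fit()

# Breusch-Godfrey test for serial correlation up to lag 1; unlike the
# Durbin Watson test, it remains valid with lagged dependent variables.
lm_stat, lm_pvalue, f_stat, f_pvalue = acorr_breusch_godfrey(results, nlags=1)
print(f"LM statistic: {lm_stat:.2f}, p-value: {lm_pvalue:.3f}")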

Example of the Durbin Watson Statistic

The formula for the Durbin Watson statistic is rather complex but involves the residuals from an ordinary least squares (OLS) regression on a set of data. The following example illustrates how to calculate this statistic.
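Written out, with e(t) denoting the residual for observation t and T the number of observations, the statistic is the ratio that the steps below compute:

\text{Durbin Watson} = \frac{\sum_{t=2}^{T} \left( e_t - e_{t-1} \right)^2}{\sum_{t=1}^{T} e_t^2}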

Assume the following (x,y) data points:

\begin{aligned}
&\text{Pair One} = (10,\ 1{,}100)\\
&\text{Pair Two} = (20,\ 1{,}200)\\
&\text{Pair Three} = (35,\ 985)\\
&\text{Pair Four} = (40,\ 750)\\
&\text{Pair Five} = (50,\ 1{,}215)\\
&\text{Pair Six} = (45,\ 1{,}000)\\
\end{aligned}

Using the methods of a least squares regression to find the "line of best fit," the equation for the best fit line of this data is:

Y = -2.6268x + 1{,}129.2
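As a quick check, numpy's polyfit (used here as a stand-in for whatever least squares routine one prefers) reproduces those coefficients:

import numpy as np

# The six (x, y) data points from the example above.
x = np.array([10, 20, 35, 40, 50, 45], dtype=float)
y = np.array([1100, 1200, 985, 750, 1215, 1000], dtype=float)

# A degree-1 polynomial fit is the ordinary least squares line of best fit.
slope, intercept = np.polyfit(x, y, 1)
print(f"y = {slope:.4f}x + {intercept:.1f}")  # roughly y = -2.6268x + 1129.2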

The first step in calculating the Durbin Watson statistic is to calculate the expected "y" values using the line of best fit equation. For this data set, the expected "y" values are:

\begin{aligned}
&\text{Expected } Y(1) = (-2.6268 \times 10) + 1{,}129.2 = 1{,}102.9\\
&\text{Expected } Y(2) = (-2.6268 \times 20) + 1{,}129.2 = 1{,}076.7\\
&\text{Expected } Y(3) = (-2.6268 \times 35) + 1{,}129.2 = 1{,}037.3\\
&\text{Expected } Y(4) = (-2.6268 \times 40) + 1{,}129.2 = 1{,}024.1\\
&\text{Expected } Y(5) = (-2.6268 \times 50) + 1{,}129.2 = 997.9\\
&\text{Expected } Y(6) = (-2.6268 \times 45) + 1{,}129.2 = 1{,}011\\
\end{aligned}

Next, the differences of the actual "y" values versus the expected "y" values, the errors, are calculated:

\begin{aligned}
&\text{Error}(1) = (1{,}100 - 1{,}102.9) = -2.9\\
&\text{Error}(2) = (1{,}200 - 1{,}076.7) = 123.3\\
&\text{Error}(3) = (985 - 1{,}037.3) = -52.3\\
&\text{Error}(4) = (750 - 1{,}024.1) = -274.1\\
&\text{Error}(5) = (1{,}215 - 997.9) = 217.1\\
&\text{Error}(6) = (1{,}000 - 1{,}011) = -11\\
\end{aligned}

Next, these errors must be squared and summed:

\begin{aligned}
\text{Sum of Errors Squared} &= (-2.9)^2 + (123.3)^2 + (-52.3)^2 + (-274.1)^2 + (217.1)^2 + (-11)^2\\
&= 140{,}330.81
\end{aligned}

Next, the differences between each error and the previous error are calculated and squared:

\begin{aligned}
&\text{Difference}(1) = (123.3 - (-2.9)) = 126.2\\
&\text{Difference}(2) = (-52.3 - 123.3) = -175.6\\
&\text{Difference}(3) = (-274.1 - (-52.3)) = -221.9\\
&\text{Difference}(4) = (217.1 - (-274.1)) = 491.3\\
&\text{Difference}(5) = (-11 - 217.1) = -228.1\\
&\text{Sum of Differences Squared} = 389{,}406.71\\
\end{aligned}

Finally, the Durbin Watson statistic is the quotient of the sum of the squared differences divided by the sum of the squared errors:

\text{Durbin Watson} = 389{,}406.71 / 140{,}330.81 = 2.77

Note: The tenths place may be off due to rounding errors in the squaring.
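Pulling the fit and the residual arithmetic together, the whole example can be verified in a few lines of Python; this sketch simply redoes the arithmetic above with numpy and is not tied to any particular statistics package.

import numpy as np

# The six (x, y) data points from the example.
x = np.array([10, 20, 35, 40, 50, 45], dtype=float)
y = np.array([1100, 1200, 985, 750, 1215, 1000], dtype=float)

# Line of best fit, as computed earlier.
slope, intercept = np.polyfit(x, y, 1)

# Errors: actual y minus the expected y from the fitted line.
errors = y - (slope * x + intercept)

# Durbin Watson: sum of squared successive differences of the errors
# divided by the sum of the squared errors.
dw = np.sum(np.diff(errors) ** 2) / np.sum(errors ** 2)
print(f"Durbin Watson statistic: {dw:.2f}")  # approximately 2.77

For comparison, statsmodels' durbin_watson helper applied to the same errors returns the same value, without the rounding of the hand calculation.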
