Residual Standard Deviation

What Is Residual Standard Deviation?

Residual standard deviation is a statistical term for the standard deviation of the differences between observed values and the values predicted by a regression analysis. In other words, it describes how widely the data points are spread around the fitted line.

Regression analysis is a method used in statistics to show a relationship between two different variables, and to describe how well you can predict the behavior of one variable from the behavior of another.

Residual standard deviation is also referred to as the standard deviation of points around a fitted line or the standard error of estimate.

Residual standard deviation is the standard deviation of the residual values, that is, the differences between a set of observed and predicted values.
The standard deviation of the residuals measures how much the data points spread around the regression line.
The result is used to gauge the error in the regression line's predictions.
The smaller the residual standard deviation is compared to the sample standard deviation of the observed values, the more predictive, or useful, the model is.

Understanding Residual Standard Deviation

Residual standard deviation is a goodness-of-fit measure that can be used to analyze how well a model fits a set of data points. In a business setting, for example, after performing a regression analysis on multiple data points of costs over time, the residual standard deviation can give a business owner information on the difference between actual costs and projected costs, and an idea of how much actual costs could vary around the projected costs.

Formula for Residual Standard Deviation

$$
\begin{aligned}
&\text{Residual}=\left(Y-Y_{est}\right)\\
&S_{res}=\sqrt{\frac{\sum \left(Y-Y_{est}\right)^2}{n-2}}\\
&\textbf{where:}\\
&S_{res}=\text{Residual standard deviation}\\
&Y=\text{Observed value}\\
&Y_{est}=\text{Estimated or projected value}\\
&n=\text{Number of data points}
\end{aligned}
$$

How to Calculate Residual Standard Deviation

To calculate the residual standard deviation, first calculate the difference between each actual value and the value predicted by the fitted line. This difference is known as the residual value, or simply the residual: the distance between a known data point and the data point predicted by the model.

Next, plug the residuals into the residual standard deviation formula: square each residual, sum the squares, divide by n - 2, and take the square root.
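The steps above can be sketched as a small Python function. This is only an illustrative sketch: the function name `residual_standard_deviation` and its argument names are assumptions made here, not something the article defines, and the n - 2 denominator follows the formula given earlier.

```python
import math

def residual_standard_deviation(observed, predicted):
    """Return sqrt(sum((y - yest)^2) / (n - 2)) for paired values."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    n = len(observed)
    if n <= 2:
        raise ValueError("need more than two data points so that n - 2 is positive")
    # Step 1: residual for each point, y - yest
    residuals = [y - y_est for y, y_est in zip(observed, predicted)]
    # Step 2: sum of squared residuals (the numerator)
    sum_squared = sum(r ** 2 for r in residuals)
    # Step 3: divide by n - 2 and take the square root
    return math.sqrt(sum_squared / (n - 2))
```

A perfect fit, where every predicted value equals the observed value, returns 0.0.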

Example of Residual Standard Deviation 

Start by calculating residual values. For example, assume you have a set of four observed values from an unnamed experiment. The table below shows the y values observed and recorded for given values of x:

If the linear equation of the line fitted to the data in the model is given as yest = 1x + 2, where yest is the predicted y value, the residual for each observation can be found.

The residual is equal to (y - yest), so for the first set, the actual y value is 1 and the predicted yest value given by the equation is yest = 1(1) + 2 = 3. The residual value is thus 1 – 3 = -2, a negative residual value.

For the second set of x and y data points, where x is 2 and y is 4, the predicted y value can be calculated as yest = 1(2) + 2 = 4.

In this case, the actual and predicted values are the same, so the residual value is zero. You would use the same process to arrive at the predicted y values and residuals for the remaining two data points.
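The residual calculation for the two observations spelled out above can be reproduced in a few lines of Python. This sketch uses only the points stated in the text, (1, 1) and (2, 4); the helper name `predict` is an assumption introduced for illustration.

```python
def predict(x):
    # Fitted line from the example: yest = 1x + 2
    return 1 * x + 2

# Only the two observations described in the text are used here.
points = [(1, 1), (2, 4)]

# Residual = observed y minus predicted yest
residuals = [y - predict(x) for x, y in points]
print(residuals)  # [-2, 0]
```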

Once you’ve calculated the residuals for all points using the table or a graph, use the residual standard deviation formula.

[Table: residual (y - yest) for each observation, and the sum of each residual squared, Σ(y - yest)^2]

Observe that the sum of the squared residuals = 6, which represents the numerator of the residual standard deviation equation.

For the denominator of the residual standard deviation equation, n is the number of data points, which is 4 in this case. Calculate the denominator of the equation as:

(n - 2) = (4 - 2) = 2

Finally, calculate the square root of the result:

Sres = √(6 / 2) = √3 ≈ 1.732
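The last two steps can be checked numerically. The sketch below takes the sum of squared residuals (6) and the number of data points (4) directly from the example; the variable names are assumptions chosen for readability.

```python
import math

sum_squared_residuals = 6  # numerator stated in the example
n = 4                      # number of data points in the example

denominator = n - 2
s_res = math.sqrt(sum_squared_residuals / denominator)

print(denominator)      # 2
print(round(s_res, 3))  # 1.732
```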

The magnitude of a typical residual gives you a sense of how close your estimates generally are. The smaller the residual standard deviation, the closer the fit of the estimate to the actual data. In effect, the smaller the residual standard deviation is compared to the sample standard deviation of the observed values, the more predictive, or useful, the model is.
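To make the comparison with the sample standard deviation concrete, here is a minimal sketch. The data are hypothetical, invented purely for illustration and not taken from the article, and the line is fitted with an ordinary least-squares calculation written out by hand.

```python
import math
import statistics

# Hypothetical data, invented purely for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

# Ordinary least-squares slope and intercept for the fitted line.
x_mean = statistics.mean(x)
y_mean = statistics.mean(y)
s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
s_xx = sum((xi - x_mean) ** 2 for xi in x)
slope = s_xy / s_xx
intercept = y_mean - slope * x_mean

# Residual standard deviation with the n - 2 denominator.
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
s_res = math.sqrt(sum(r ** 2 for r in residuals) / (len(y) - 2))

# Sample standard deviation of the observed y values (n - 1 denominator).
s_y = statistics.stdev(y)

print(f"residual SD = {s_res:.3f}, sample SD of y = {s_y:.3f}")
```

With these made-up numbers the residual standard deviation comes out smaller than the sample standard deviation of y, which is the pattern the paragraph above associates with a more useful model.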

The residual standard deviation can be calculated when a regression analysis has been performed, as well as an analysis of variance (ANOVA). When determining a limit of quantitation (LoQ), the use of a residual standard deviation is permissible instead of the standard deviation.
