
Autoregressive Integrated Moving Average (ARIMA)
An autoregressive integrated moving average, or ARIMA, is a statistical analysis model that uses time series data either to better understand the data set or to predict future trends. A statistical model is autoregressive if it predicts future values based on past values. For ARIMA models, a standard notation is ARIMA(p, d, q), where integer values substitute for the parameters to indicate the type of ARIMA model used.

What Is an Autoregressive Integrated Moving Average (ARIMA)?
An autoregressive integrated moving average, or ARIMA, is a statistical analysis model that uses time series data to either better understand the data set or to predict future trends.
A statistical model is autoregressive if it predicts future values based on past values. For example, an ARIMA model might seek to predict a stock's future prices based on its past performance or forecast a company's earnings based on past periods.





Understanding Autoregressive Integrated Moving Average (ARIMA)
An autoregressive integrated moving average model is a form of regression analysis that gauges the strength of one dependent variable relative to other changing variables. The model's goal is to predict future securities or financial market moves by examining the differences between values in the series rather than the actual values themselves.
An ARIMA model can be understood by outlining each of its components as follows:
_Autoregression (AR)_: refers to a model that shows a changing variable that regresses on its own lagged, or prior, values.
_Integrated (I)_: represents the differencing of raw observations to allow the time series to become stationary (i.e., data values are replaced by the differences between the data values and the previous values).
_Moving average (MA)_: incorporates the dependency between an observation and a residual error from a moving average model applied to lagged observations.
ARIMA Parameters
Each component in ARIMA functions as a parameter with a standard notation. For ARIMA models, a standard notation would be ARIMA(p, d, q), where integer values substitute for the parameters to indicate the type of ARIMA model used. The parameters can be defined as:
_p_: the number of lag observations in the model, also known as the lag order.
_d_: the number of times the raw observations are differenced, also known as the degree of differencing.
_q_: the size of the moving average window, also known as the order of the moving average.
In a linear regression model, for example, the number and type of terms are specified. A value of 0, which can be used as a parameter, means that particular component should not be used in the model. This way, the ARIMA model can be constructed to perform the function of an ARMA model, or even a simple AR, I, or MA model.
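As a minimal sketch (assuming the statsmodels library and a synthetic series, neither of which comes from this article), the order tuple below shows how zeroing individual parameters collapses ARIMA into those simpler models:

```python
# Minimal sketch, assuming statsmodels; the "prices" series here is synthetic.
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(0)
prices = pd.Series(100 + rng.normal(0, 1, 200).cumsum())  # stand-in for a price history

# order=(p, d, q): lag order, degree of differencing, moving average order.
arima_211 = ARIMA(prices, order=(2, 1, 1))  # full ARIMA(2,1,1)
arma_21   = ARIMA(prices, order=(2, 0, 1))  # d = 0 -> ARMA(2,1)
ar_2      = ARIMA(prices, order=(2, 0, 0))  # d = q = 0 -> pure AR(2)
ma_1      = ARIMA(prices, order=(0, 0, 1))  # p = d = 0 -> pure MA(1)
```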
Because ARIMA models are complicated and work best on very large data sets, computer algorithms and machine learning techniques are used to compute them.
Autoregressive Integrated Moving Average (ARIMA) and Stationarity
In an autoregressive integrated moving average model, the data are differenced in order to make them stationary. A model that shows stationarity is one in which the data show constancy over time. Most economic and market data show trends, so the purpose of differencing is to remove any trends or seasonal structures.
Seasonality, or when data show regular and predictable patterns that repeat over a calendar year, could negatively affect the regression model. If a trend appears and stationarity is not evident, many of the computations throughout the process cannot be made with great efficacy.
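To make differencing concrete, here is a small illustrative sketch (my own, assuming pandas and statsmodels) that first-differences a trending synthetic series and runs an augmented Dickey-Fuller test before and after:

```python
# Illustrative sketch of differencing for stationarity, assuming pandas and statsmodels.
import numpy as np
import pandas as pd
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(1)
trend = 0.5 * np.arange(300)           # deterministic upward trend
walk = rng.normal(0, 2, 300).cumsum()  # random-walk component
series = pd.Series(trend + walk)

differenced = series.diff().dropna()   # d = 1: each value minus the previous value

# Augmented Dickey-Fuller test: a small p-value suggests stationarity.
print("p-value before differencing:", adfuller(series)[1])
print("p-value after differencing: ", adfuller(differenced)[1])
```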
A one-time shock will affect subsequent values of an ARIMA model infinitely into the future. Therefore, the legacy of the 2008 financial crisis lives on in today’s autoregressive models.
Special Considerations
ARIMA models are based on the assumption that past values have some residual effect on current or future values. For example, an investor using an ARIMA model to forecast stock prices would assume that new buyers and sellers of that stock are influenced by recent market transactions when deciding how much to offer or accept for the security.
Although this assumption will hold under many circumstances, it is not always the case. For example, in the years prior to the 2008 financial crisis, most investors were not aware of the risks posed by the large portfolios of mortgage-backed securities (MBS) held by many financial firms.
During those times, an investor using an autoregressive model to predict the performance of U.S. financial stocks would have had good reason to predict an ongoing trend of stable or rising stock prices in that sector. However, once it became public knowledge that many financial institutions were at risk of imminent collapse, investors suddenly became less concerned with these stocks' recent prices and far more concerned with their underlying risk exposure. Therefore, the market rapidly revalued financial stocks to a much lower level, a move that would have utterly confounded an autoregressive model.
Frequently Asked Questions
What is ARIMA used for?
ARIMA is a method for forecasting or predicting future outcomes based on a historical time series. It is based on the statistical concept of serial correlation, where past data points influence future data points.
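As a rough illustration (not from this article), serial correlation can be measured directly with pandas before committing to a full ARIMA model:

```python
# Rough sketch: measuring serial correlation with pandas (synthetic data).
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
# A 3-period rolling mean of noise produces a mildly autocorrelated series.
series = pd.Series(rng.normal(0, 1, 500)).rolling(3).mean().dropna()

# autocorr(lag=k) measures how strongly each value relates to the value k periods back.
print("lag-1 autocorrelation:", series.autocorr(lag=1))
print("lag-5 autocorrelation:", series.autocorr(lag=5))
```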
What are the differences between autoregressive and moving average models?
ARIMA combines autoregressive features with those of moving averages. An AR(1) autoregressive process, for instance, is one in which the current value is based on the immediately preceding value, while an AR(2) process is one in which the current value is based on the previous two values. A moving average is a calculation used to analyze data points by creating a series of averages of different subsets of the full data set in order to smooth out the influence of outliers. As a result of this combination of techniques, ARIMA models can take into account trends, cycles, seasonality, and other non-static types of data when making forecasts.
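For example, the two autoregressive processes can be simulated directly from their recursions; the sketch below (my own illustration, using numpy) makes the one-lag versus two-lag dependence explicit:

```python
# Illustrative sketch: simulating AR(1) and AR(2) processes with numpy.
import numpy as np

rng = np.random.default_rng(3)
n = 200
shocks = rng.normal(0, 1, n)   # random error terms

ar1 = np.zeros(n)
ar2 = np.zeros(n)
for t in range(2, n):
    ar1[t] = 0.7 * ar1[t - 1] + shocks[t]                     # AR(1): one prior value
    ar2[t] = 0.5 * ar2[t - 1] + 0.3 * ar2[t - 2] + shocks[t]  # AR(2): two prior values
```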
How does ARIMA forecasting work?
ARIMA forecasting is achieved by plugging in time series data for the variable of interest. Statistical software will identify the appropriate number of lags or amount of differencing to be applied to the data and check for stationarity. It will then output the results, which are often interpreted similarly to those of a multiple linear regression model.
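A minimal end-to-end sketch of that workflow, assuming the statsmodels library and a synthetic series standing in for the variable of interest (the order here is chosen by hand rather than identified automatically), might look like this:

```python
# Minimal forecasting sketch, assuming statsmodels; the data and the (1, 1, 1) order are illustrative.
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(4)
prices = pd.Series(100 + rng.normal(0, 1, 250).cumsum())  # stand-in for historical prices

results = ARIMA(prices, order=(1, 1, 1)).fit()

print(results.summary())          # coefficient table, read much like regression output
print(results.forecast(steps=5))  # projected values for the next five periods
```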
Related terms:
Autoregressive Defined
A statistical model is autoregressive if it predicts future values based on past values (i.e., predicting future stock prices based on past performance).
Box-Jenkins Model
The Box-Jenkins Model is a mathematical model designed to forecast data from a specified time series.
Error Term
An error term is a variable in a statistical model that is created when the model does not fully represent the actual relationship between the independent and dependent variables.
Generalized AutoRegressive Conditional Heteroskedasticity (GARCH)
Generalized AutoRegressive Conditional Heteroskedasticity (GARCH) is a statistical model used to estimate the volatility of stock returns.
Mortgage-Backed Security (MBS)
A mortgage-backed security (MBS) is an investment similar to a bond that consists of a bundle of home loans bought from the banks that issued them.
Moving Average (MA)
A moving average (MA) is a technical analysis indicator that helps smooth out price action by filtering out the “noise” from random price fluctuations.
Defining Nonlinear Regression
Nonlinear regression is a form of regression analysis in which data fit to a model is expressed as a mathematical function.
Regression
Regression is a statistical measurement that attempts to determine the strength of the relationship between one dependent variable (usually denoted by Y) and a series of other changing variables (known as independent variables).
Rescaled Range Analysis and Uses
Rescaled range analysis is used to calculate the Hurst exponent, which is a measure of the strength of time series trends and mean reversion.
Seasonality
Seasonality is a characteristic of a time series in which the data experiences regular and predictable changes that recur every calendar year.