Defining Nonlinear Regression

Nonlinear regression is a form of regression analysis in which data is fit to a model and then expressed as a mathematical function. Simple linear regression relates two variables (X and Y) with a straight line (y = mx + b), while nonlinear regression relates the two variables through a nonlinear (curved) relationship.

The goal of the model is to make the sum of squares as small as possible. The sum of squares is a measure that tracks how far the Y observations vary from the nonlinear (curved) function used to predict Y.

It is computed by first finding the difference between the fitted nonlinear function and every Y point of data in the set. Then, each of those differences is squared. Lastly, all of the squared figures are added together. The smaller the sum of these squared figures, the better the function fits the data points in the set. Nonlinear regression uses logarithmic functions, trigonometric functions, exponential functions, power functions, Lorenz curves, Gaussian functions, and other fitting methods.
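
A minimal sketch of that calculation, assuming Python with NumPy, a hypothetical exponential model y = a * exp(b * x), and made-up data and trial parameter values:

```python
import numpy as np

# Hypothetical data and an assumed exponential model y = a * exp(b * x);
# the data and the trial values of a and b are illustrative only.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 8.3, 16.2, 33.0])

def model(x, a, b):
    """Candidate nonlinear function used to predict Y from X."""
    return a * np.exp(b * x)

a, b = 2.0, 0.7  # trial parameter values

residuals = y - model(x, a, b)           # difference between each Y observation and the curve
sum_of_squares = np.sum(residuals ** 2)  # square the differences, then add them up
print(sum_of_squares)
```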

Nonlinear regression modeling is similar to linear regression modeling in that both seek to track a particular response from a set of variables graphically. Nonlinear models are more complicated than linear models to develop because the function is created through a series of approximations (iterations) that may stem from trial-and-error. Mathematicians use several established methods, such as the Gauss-Newton method and the Levenberg-Marquardt method.
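
As a hedged illustration, the sketch below uses SciPy's curve_fit routine, which applies a Levenberg-Marquardt implementation by default when no parameter bounds are supplied; the exponential model and the data are the same illustrative values used above, not figures from the article.

```python
import numpy as np
from scipy.optimize import curve_fit

# Illustrative data for an assumed exponential relationship.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 8.3, 16.2, 33.0])

def model(x, a, b):
    return a * np.exp(b * x)

# With no bounds given, curve_fit uses the Levenberg-Marquardt algorithm,
# iteratively refining the parameters from the starting guesses in p0.
params, covariance = curve_fit(model, x, y, p0=[1.0, 1.0])
a_hat, b_hat = params
print(a_hat, b_hat)
```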

Often, regression models that appear nonlinear at first glance are actually linear. The curve estimation procedure can be used to identify the nature of the functional relationships at play in your data, so you can choose the correct regression model, whether linear or nonlinear. Linear regression models, while they typically form a straight line, can also form curves, depending on the form of the linear regression equation. Likewise, it’s possible to use algebra to transform a nonlinear equation so that it mimics a linear equation; such a nonlinear equation is referred to as “intrinsically linear.”
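
For example, an exponential model is intrinsically linear: taking logarithms of y = a * exp(b * x) gives ln(y) = ln(a) + b * x, which ordinary linear regression can fit. The sketch below assumes NumPy and reuses the illustrative data from above.

```python
import numpy as np

# Illustrative data believed to follow y = a * exp(b * x).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 8.3, 16.2, 33.0])

# Fit a straight line to (x, ln y); the nonlinear model has been
# transformed into an intrinsically linear one.
slope, intercept = np.polyfit(x, np.log(y), deg=1)
b_hat = slope
a_hat = np.exp(intercept)
print(a_hat, b_hat)
```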

Linear regression relates two variables with a straight line; nonlinear regression relates the variables using a curve.

Both linear and nonlinear regression predict Y responses from an X variable (or variables).
Nonlinear regression is a curved function of an X variable (or variables) that is used to predict a Y variable.
Nonlinear regression can show a prediction of population growth over time.

Example of Nonlinear Regression

One example of how nonlinear regression can be used is to predict population growth over time. A scatterplot of changing population data over time shows that there seems to be a relationship between time and population growth, but that it is a nonlinear relationship, requiring the use of a nonlinear regression model. A logistic population growth model can provide estimates of the population for periods that were not measured, and predictions of future population growth.
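
A minimal sketch of such a fit, assuming SciPy and entirely hypothetical population figures (the logistic form and the starting values are illustrative, not taken from any real census):

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic(t, K, r, t0):
    """Logistic growth: K is the carrying capacity, r the growth rate,
    and t0 the midpoint (inflection point) of the curve."""
    return K / (1.0 + np.exp(-r * (t - t0)))

# Hypothetical population counts (in millions) at ten-year intervals.
years = np.array([0, 10, 20, 30, 40, 50, 60], dtype=float)
population = np.array([5.3, 9.6, 17.0, 28.0, 41.0, 52.0, 58.0])

# Starting values chosen from a rough look at the data.
params, _ = curve_fit(logistic, years, population, p0=[60.0, 0.1, 30.0])
K, r, t0 = params

print(logistic(35.0, K, r, t0))  # estimate for an unmeasured period
print(logistic(80.0, K, r, t0))  # prediction of future population
```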

Independent and dependent variables used in nonlinear regression should be quantitative. Categorical variables, like region of residence or religion, should be coded as binary variables or other types of quantitative variables.
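
A short, hedged example of that recoding, assuming pandas and a made-up data set with a "region" column:

```python
import pandas as pd

# Illustrative data: "region" is categorical and must be recoded as
# binary (0/1) indicator columns before entering a regression model.
df = pd.DataFrame({
    "region": ["north", "south", "west", "south"],
    "income": [52.0, 48.5, 61.2, 45.9],
})

# One indicator column per category; drop_first avoids a redundant column.
coded = pd.get_dummies(df, columns=["region"], drop_first=True, dtype=int)
print(coded)
```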

In order to obtain accurate results from the nonlinear regression model, you should make sure the function you specify describes the relationship between the independent and dependent variables accurately. Good starting values are also necessary. Poor starting values may result in a model that fails to converge, or a solution that is only optimal locally, rather than globally, even if you’ve specified the right functional form for the model.
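
Continuing the hypothetical logistic example above, one common practice is to derive starting values from the data itself and to treat a failure to converge as a signal that the starting values (or the functional form) need another look; the data and choices here are illustrative only.

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic(t, K, r, t0):
    return K / (1.0 + np.exp(-r * (t - t0)))

years = np.array([0, 10, 20, 30, 40, 50, 60], dtype=float)
population = np.array([5.3, 9.6, 17.0, 28.0, 41.0, 52.0, 58.0])

# Starting values taken from the data: carrying capacity a bit above the
# largest observation, midpoint near the middle of the time range.
good_start = [population.max() * 1.1, 0.1, years.mean()]

try:
    params, _ = curve_fit(logistic, years, population, p0=good_start, maxfev=5000)
    print("converged:", params)
except RuntimeError as err:
    # curve_fit raises RuntimeError when the iterations fail to converge,
    # a common symptom of poor starting values.
    print("did not converge:", err)
```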

Related terms:

Error Term

An error term is a residual variable in a statistical model that arises when the model does not fully represent the actual relationship between the independent and dependent variables.

Hedonic Regression

Hedonic regression applies regression analysis to estimate the relative impact of the variables that affect the price of a good or service.

Least Squares Criterion

The least-squares criterion is a method of measuring the accuracy of a line in depicting the data that was used to generate it. That is, the formula determines the line of best fit.

Multiple Linear Regression (MLR)

Multiple linear regression (MLR) is a statistical technique that uses several explanatory variables to predict the outcome of a response variable.

Regression

Regression is a statistical measurement that attempts to determine the strength of the relationship between one dependent variable (usually denoted by Y) and a series of other changing variables (known as independent variables).

Residual Sum of Squares (RSS)

The residual sum of squares (RSS) is a statistical technique used to measure the variance in a data set that is not explained by the regression model.

Sum of Squares

Sum of squares is a statistical technique used in regression analysis to determine the dispersion of data points from their mean value. In a regression analysis, the goal is to determine how well a data series can be fitted to a function that might help to explain how the data series was generated.