The Danger of Extrapolation in Regression Analysis

Regression analysis is a valuable statistical tool primarily because it allows analysts to predict (or regress, as we prefer to call it) one variable from a set of other variables. It is one of the techniques used in predictive analytics. Predictive analytics is a powerful tool to have in most scenarios, as it lets users estimate what an outcome might be from several inputs by way of a mathematical model.

As predictive analytics has surged in popularity with the advent of things such as big data, I believe that the topic of extrapolation merits some attention. Oxford Dictionaries defines “extrapolate” as “extend the application of (a method or conclusion) to an unknown situation by assuming that existing trends will continue or similar methods will be applicable”, with “assuming” being the key word here. In mathematical terms, Wikipedia refers to extrapolation as “the process of estimating, beyond the original observation interval, the value of a variable on the basis of its relationship with another variable”, with “beyond the original observation interval” being the key phrase. Therein lies a problem.
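To make the definition concrete, here is a minimal sketch in Python. Everything in it is an assumption chosen for illustration: the data are synthetic, the "true" relationship is taken to be a square root, and the names `x_obs`, `y_obs` and `predict` are hypothetical. A straight line is fitted only to observations on the interval [1, 4] and then asked for a prediction well outside that interval.

```python
import numpy as np

# Synthetic, illustrative data: the assumed true relationship is y = sqrt(x),
# but we only observe it on the interval [1, 4].
x_obs = np.linspace(1.0, 4.0, 20)
y_obs = np.sqrt(x_obs)

# Fit a straight line (degree-1 polynomial) by least squares.
# For deg=1, np.polyfit returns the coefficients as [slope, intercept].
slope, intercept = np.polyfit(x_obs, y_obs, deg=1)


def predict(x):
    """Prediction from the fitted line; valid in spirit only inside [1, 4]."""
    return slope * x + intercept


x_inside = 2.5    # within the original observation interval (interpolation)
x_outside = 25.0  # far beyond the original observation interval (extrapolation)

err_inside = abs(predict(x_inside) - np.sqrt(x_inside))
err_outside = abs(predict(x_outside) - np.sqrt(x_outside))

print(f"error at x={x_inside}: {err_inside:.3f}")
print(f"error at x={x_outside}: {err_outside:.3f}")
```

Inside the observed range the line is a reasonable stand-in for the curve. Outside it, the model keeps assuming the trend continues, and the error grows accordingly, even though nothing about the fit itself warns of this.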

