Statistical model validation



Model validation is possibly the most important step in the model-building sequence. It is also one of the most overlooked. Often the validation of a model seems to consist of nothing more than quoting the R² statistic from the fit, which measures the fraction of the total variability in the response that is accounted for by the model.

R² is not enough!

Unfortunately, a high R² (coefficient of determination) does not guarantee that the model fits the data well. A model that does not fit the data well cannot provide good answers to the underlying engineering or scientific questions under investigation.
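As a small illustration of this point, the following sketch fits a straight line to data that are exactly quadratic. The data, variable names, and the choice of NumPy are assumptions made for the example, not part of the original text. The fit yields a high R², yet the residuals are clearly structured: positive at both ends of the range and negative in the middle.

```python
import numpy as np

# Hypothetical data: the true relationship is quadratic, with no noise at all.
x = np.linspace(0.0, 10.0, 50)
y = x ** 2

# Fit a straight line by least squares (deliberately the wrong functional form).
slope, intercept = np.polyfit(x, y, 1)
fitted = slope * x + intercept
residuals = y - fitted

# R^2 = 1 - SS_res / SS_tot
ss_res = np.sum(residuals ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot

# Count how often the residuals change sign along x.
# Random-looking residuals would change sign often; here they do so only twice.
signs = np.sign(residuals)
sign_changes = int(np.sum(signs[1:] != signs[:-1]))

print(f"R^2 = {r_squared:.3f}, sign changes in residuals = {sign_changes}")
```

The R² value alone would suggest a good model, while the sign pattern of the residuals reveals that the functional form is wrong.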

Analysis of residuals

The residuals from a fitted model are the differences between the responses observed at each combination of values of the explanatory variables and the corresponding predictions of the response computed using the regression function. Mathematically, the residual for the i-th observation in the data set is written

$e_i = y_i - f(\vec{x}_i; \hat{\vec{\beta}})$,

with $y_i$ denoting the i-th response in the data set, $\vec{x}_i$ the list of explanatory variables, each set at the corresponding values found in the i-th observation in the data set, and $\hat{\vec{\beta}}$ the vector of estimated parameters.
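The definition of the residuals can be sketched in code as follows. The straight-line regression function, the data values, and the helper names are hypothetical choices for illustration; any fitted regression function could take their place.

```python
import numpy as np

def regression_function(x, beta):
    """Hypothetical regression function f(x; beta): a straight line in one predictor."""
    return beta[0] + beta[1] * x

def residuals(x, y, beta_hat):
    """Return e_i = y_i - f(x_i; beta_hat) for every observation."""
    return y - regression_function(x, beta_hat)

# Made-up observations, used only to exercise the definition.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Estimate the parameters by ordinary least squares.
design = np.column_stack([np.ones_like(x), x])
beta_hat, *_ = np.linalg.lstsq(design, y, rcond=None)

e = residuals(x, y, beta_hat)
print(e)
```

For a least-squares fit that includes an intercept, the residuals sum to zero up to rounding error, which is one of the constraints that estimation imposes on them.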

If the model fit to the data were correct, the residuals would approximate the random errors that make the relationship between the explanatory variables and the response variable a statistical relationship. Therefore, if the residuals appear to behave randomly, it suggests that the model fits the data well. On the other hand, if non-random structure is evident in the residuals, it is a clear sign that the model fits the data poorly. The next section details the types of plots to use to test different aspects of a model and gives guidance on the correct interpretation of the different results that could be observed for each type of plot.

Graphical analysis of residuals

There are many statistical tools for model validation, but the primary tool for most modeling applications is graphical residual analysis. Different types of plots of the residuals from a fitted model provide information on the adequacy of different aspects of the model.
#sufficiency of the functional part of the model: scatter plots of residuals versus predictors
#non-constant variation across the data: scatter plots of residuals versus predictors; for data collected over time, also plots of residuals against time
#drift in the errors (data collected over time): run charts of the response and errors versus time
#independence of errors: lag plot
#normality of errors: histogram and normal probability plot

Graphical methods have an advantage over numerical methods for model validation because they readily illustrate a broad range of complex aspects of the relationship between the model and the data.
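The plots listed above are all simple transformations of the residuals. The following sketch computes the coordinates behind each one so that they can be passed to any plotting library. The helper name `residual_plot_data`, the simulated data, and the plotting-position formula (i - 0.5)/n for the normal probability plot are assumptions made for the example; other conventions exist.

```python
from statistics import NormalDist

import numpy as np

def residual_plot_data(x, y, fitted, bins=10):
    """Return the coordinates behind each residual plot (hypothetical helper).

    Assumes the observations are stored in the order they were collected,
    so that the row index can stand in for time.
    """
    e = np.asarray(y) - np.asarray(fitted)
    n = e.size
    # Theoretical normal quantiles at plotting positions (i - 0.5) / n.
    probs = (np.arange(1, n + 1) - 0.5) / n
    normal_q = np.array([NormalDist().inv_cdf(p) for p in probs])
    counts, edges = np.histogram(e, bins=bins)
    return {
        "residuals_vs_predictor": (np.asarray(x), e),    # functional form, constant variation
        "run_chart": (np.arange(n), e),                   # drift over time
        "lag_plot": (e[:-1], e[1:]),                      # independence: e_i against e_(i-1)
        "histogram": (counts, edges),                     # normality
        "normal_probability_plot": (normal_q, np.sort(e)) # normality
    }

# Simulated straight-line data with independent normal errors.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 40)
y = 2.0 + 3.0 * x + rng.normal(scale=1.0, size=x.size)
slope, intercept = np.polyfit(x, y, 1)

plots = residual_plot_data(x, y, slope * x + intercept)
```

Each entry is a pair of arrays giving the horizontal and vertical coordinates of the corresponding plot, except for the histogram, which holds bin counts and bin edges.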

Quantitative analysis of residuals

Numerical methods for model validation, such as the R² statistic, are also useful, but usually to a lesser degree than graphical methods. They tend to be narrowly focused on a particular aspect of the relationship between the model and the data, and often try to compress that information into a single descriptive number or test result. Numerical methods do play an important role as confirmatory methods for graphical techniques, however. For example, the lack-of-fit test for assessing the correctness of the functional part of the model can aid in interpreting a borderline residual plot.

There are also a few modeling situations in which graphical methods cannot easily be used. In these cases, numerical methods provide a fallback position for model validation. One common situation in which numerical validation methods take precedence over graphical methods is when the number of parameters being estimated is relatively close to the size of the data set. In this situation residual plots are often difficult to interpret because of the constraints that the estimation of the unknown parameters imposes on the residuals. This typically happens in optimization applications using designed experiments. Logistic regression with binary data is another area in which graphical residual analysis can be difficult.
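A minimal sketch of a lack-of-fit test is given below, assuming the common formulation that requires replicate observations at each predictor level: the residual sum of squares is split into pure error (variation of replicates about their level mean) and lack of fit (the remainder), and their mean squares are compared with an F ratio. The helper name `lack_of_fit_f` and the data are hypothetical, and the sketch stops at the F statistic rather than computing a p-value.

```python
import numpy as np

def lack_of_fit_f(x, y, fitted, n_params):
    """Return (F statistic, lack-of-fit df, pure-error df).

    Hypothetical helper; assumes replicate observations share exactly
    equal values of x and that there are more levels than parameters.
    """
    levels = np.unique(x)
    # Pure error: variation of the replicates about their own level mean.
    ss_pure = sum(np.sum((y[x == lv] - y[x == lv].mean()) ** 2) for lv in levels)
    ss_res = np.sum((y - fitted) ** 2)
    ss_lof = ss_res - ss_pure
    df_lof = len(levels) - n_params
    df_pure = len(y) - len(levels)
    f_stat = (ss_lof / df_lof) / (ss_pure / df_pure)
    return f_stat, df_lof, df_pure

# Made-up data: five levels with three replicates each, quadratic in truth.
x = np.repeat([1.0, 2.0, 3.0, 4.0, 5.0], 3)
y = x ** 2 + np.tile([-0.1, 0.0, 0.1], 5)

# Fit a straight line (two parameters), which is the wrong functional form.
slope, intercept = np.polyfit(x, y, 1)
f_stat, df_lof, df_pure = lack_of_fit_f(x, y, slope * x + intercept, n_params=2)

print(f"F = {f_stat:.1f} on ({df_lof}, {df_pure}) degrees of freedom")
```

A large F ratio relative to the F distribution with the stated degrees of freedom indicates that the functional part of the model is inadequate. In this example the ratio is very large because the replicates agree closely while the straight line misses the curvature.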

See also

*Cross-validation

External links

* [http://www.itl.nist.gov/div898/handbook/pmd/section4/pmd44.htm How can I tell if a model fits my data?]



Wikimedia Foundation. 2010.
