Deviance information criterion

Deviance information criterion

The deviance information criterion (DIC) is a hierarchical modeling generalization of the AIC (Akaike information criterion) and BIC (Bayesian information criterion, also known as the Schwarz criterion). It is particularly useful in Bayesian model selection problems where the posterior distributions of the models have been obtained by Markov chain Monte Carlo (MCMC) simulation. Like AIC and BIC it is an asymptotic approximation as the sample size becomes large. It is only valid when the posterior distribution is approximately multivariate normal.

Define the deviance as  D(\theta)=-2 \log(p(y|\theta))+C\, , where y\, are the data, \theta\, are the unknown parameters of the model and  p(y|\theta)\, is the likelihood function. C\, is a constant that cancels out in all calculations that compare different models, and which therefore does not need to be known.

The expectation \bar{D}=\mathbf{E}^\theta[D(\theta)] is a measure of how well the model fits the data; the larger this is, the worse the fit.

The effective number of parameters of the model is computed as p_D=\bar{D}-D(\bar{\theta}), where \bar{\theta} is the expectation of \theta\,. The larger this is, the easier it is for the model to fit the data.

The deviance information criterion is calculated as

\mathit{DIC} = p_D+\bar{D}.

The idea is that models with smaller DIC should be preferred to models with larger DIC. Models are penalized both by the value of \bar{D}, which favors a good fit, but also (in common with AIC and BIC) by the effective number of parameters p_D\,. Since  \bar D will decrease as the number of parameters in a model increases, the p_D\, term compensates for this effect by favoring models with a smaller number of parameters.

The advantage of DIC over other criteria in the case of Bayesian model selection is that the DIC is easily calculated from the samples generated by a Markov chain Monte Carlo simulation. AIC and BIC require calculating the likelihood at its maximum over \theta\,, which is not readily available from the MCMC simulation. But to calculate DIC, simply compute \bar{D} as the average of D(\theta)\, over the samples of \theta\,, and D(\bar{\theta}) as the value of D\, evaluated at the average of the samples of  \theta\, . Then the DIC follows directly from these approximations. Claeskens and Hjort (2008, Ch. 3.5) show that the DIC is large-sample equivalent to the natural model-robust version of the AIC.

In the derivation of DIC, it assumed that the specified parametric family of probability distributions that generate future observations encompasses the true model. This assumption does not always hold, and it is desirable to consider model assessment procedures in that scenario. Also, the observed data are used both to construct the posterior distribution and to evaluate the estimated models. Therefore, DIC tends to select over-fitted models. Recently, these issues are resolved by Ando (2007), Bayesian predictive information criterion, BPIC.

See also


Wikimedia Foundation. 2010.

См. также в других словарях:

  • Deviance Information Criterion — In der Statistik ist das Abweichungsinformationskriterium (engl. deviance information criterion, DIC) ein Maß (Kriterium) für den Vorhersagefehler eines Modells. Diese Maßzahl ist ein Informationskriterium und gehört in das Umfeld der… …   Deutsch Wikipedia

  • Akaike Information Criterion — Ein Informationskriterium ist ein Kriterium zur Auswahl eines Modells in der angewandten Statistik bzw. der Ökonometrie. Dabei gehen die Anpassungsgüte des geschätzten Modells an die vorliegenden empirischen Daten (Stichprobe) und Komplexität des …   Deutsch Wikipedia

  • Akaike information criterion — Akaike s information criterion, developed by Hirotsugu Akaike under the name of an information criterion (AIC) in 1971 and proposed in Akaike (1974), is a measure of the goodness of fit of an estimated statistical model. It is grounded in the… …   Wikipedia

  • Bayesian information criterion — In statistics, in order to describe a particular dataset, one can use non parametric methods or parametric methods. In parametric methods, there might be various candidate models with different number of parameters to represent a dataset. The… …   Wikipedia

  • Deviance (statistics) — In statistics, deviance is a quality of fit statistic for a model that is often used for statistical hypothesis testing. The deviance for a model M0 is defined as Here denotes the fitted values of the parameters in the model M0, while denotes the …   Wikipedia

  • List of statistics topics — Please add any Wikipedia articles related to statistics that are not already on this list.The Related changes link in the margin of this page (below search) leads to a list of the most recent changes to the articles listed below. To see the most… …   Wikipedia

  • Rasoir d'Occam — Rasoir d Ockham Pour les articles homonymes, voir Rasoir d Occam (homonymie). Le rasoir d’Occam ou rasoir d’Ockham est un principe de raisonnement que l on attribue au frère franciscain et philosophe Guillaume d Ockham (XIVe siècle), mais… …   Wikipédia en Français

  • Rasoir d'Ockham — Pour les articles homonymes, voir Rasoir d Occam (homonymie). Il est possible de décrire le soleil et les planètes comme étant en orbite autour de la te …   Wikipédia en Français

  • Rasoir d’Ockham — Rasoir d Ockham Pour les articles homonymes, voir Rasoir d Occam (homonymie). Le rasoir d’Occam ou rasoir d’Ockham est un principe de raisonnement que l on attribue au frère franciscain et philosophe Guillaume d Ockham (XIVe siècle), mais… …   Wikipédia en Français

  • Determining the number of clusters in a data set — Determining the number of clusters in a data set, a quantity often labeled k as in the k means algorithm, is a frequent problem in data clustering, and is a distinct issue from the process of actually solving the clustering problem. For a certain …   Wikipedia

Поделиться ссылкой на выделенное

Прямая ссылка:
Нажмите правой клавишей мыши и выберите «Копировать ссылку»