- Bayesian model comparison
A common problem in statistical inference is to use data to decide between two or more competing models. Frequentist statistics uses hypothesis tests for this purpose. There are several Bayesian approaches; one of them is through Bayes factors.
The posterior probability of a model $H$ given data $D$, $P(H \mid D)$, is given by Bayes' theorem:
$$P(H \mid D) = \frac{P(D \mid H)\,P(H)}{P(D)}.$$
The key data-dependent term, $P(D \mid H)$, is a likelihood, and is sometimes called the evidence for model $H$; evaluating it correctly is the key to Bayesian model comparison. The evidence is usually the normalizing constant or partition function of another inference, namely the inference of the parameters of model $H$ given the data $D$.
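As a minimal sketch of the evidence acting as a normalizing constant, consider a hypothetical coin-flip model (not from the original text): the parameter `theta` is the probability of heads, the prior over `theta` is assumed uniform on [0, 1], and the data are 7 heads in 10 flips. The function names and the midpoint-rule integration are illustrative choices only.

```python
import math


def binomial_likelihood(theta, heads, flips):
    """P(D | theta, H): probability of `heads` heads in `flips` flips."""
    return math.comb(flips, heads) * theta**heads * (1.0 - theta) ** (flips - heads)


def evidence_uniform_prior(heads, flips, grid_size=10_000):
    """Approximate P(D | H) = integral of P(D | theta, H) * P(theta | H) d(theta).

    Assumes a uniform prior P(theta | H) = 1 on [0, 1] and uses a midpoint rule.
    """
    width = 1.0 / grid_size
    total = 0.0
    for i in range(grid_size):
        theta = (i + 0.5) * width
        prior_density = 1.0  # uniform prior (an assumption of this sketch)
        total += binomial_likelihood(theta, heads, flips) * prior_density * width
    return total


def posterior_density(theta, heads, flips):
    """Parameter posterior P(theta | D, H): likelihood * prior / evidence."""
    evidence = evidence_uniform_prior(heads, flips)
    return binomial_likelihood(theta, heads, flips) * 1.0 / evidence


evidence = evidence_uniform_prior(heads=7, flips=10)
print(f"evidence P(D | H) is approximately {evidence:.6f}")
```

For a uniform prior the integral has the closed form 1 / (flips + 1), which gives a check on the numerical result. The same number that normalizes the parameter posterior is the quantity compared across models.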
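The plausibility of two different models $H_1$ and $H_2$, parametrised by model parameter vectors $\theta_1$ and $\theta_2$, is assessed by the Bayes factor, given by
$$K = \frac{P(D \mid H_1)}{P(D \mid H_2)} = \frac{\int P(\theta_1 \mid H_1)\,P(D \mid \theta_1, H_1)\,d\theta_1}{\int P(\theta_2 \mid H_2)\,P(D \mid \theta_2, H_2)\,d\theta_2}.$$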
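Thus Bayesian model comparison does not depend on any particular values of the parameters of each model. Instead, the parameters are integrated out, so that each model is judged by the probability it assigns to the data, averaged over all possible parameter values weighted by their prior. Alternatively, the maximum likelihood estimate could be used for each of the parameters, which replaces the Bayes factor with a likelihood ratio evaluated at the best-fitting parameter values.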
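The difference between the two options can be sketched on a hypothetical coin example (an illustration, not part of the original text). Here $H_1$ is assumed to be a fair coin with no free parameter, $H_2$ a coin of unknown bias with a uniform prior, and the Bayes factor is taken as $P(D \mid H_1) / P(D \mid H_2)$. All function names are made up for the sketch.

```python
import math


def binomial_likelihood(theta, heads, flips):
    """Probability of `heads` heads in `flips` flips for heads-probability theta."""
    return math.comb(flips, heads) * theta**heads * (1.0 - theta) ** (flips - heads)


def evidence_fair_coin(heads, flips):
    """P(D | H1): H1 fixes theta = 0.5, so there is nothing to integrate over."""
    return binomial_likelihood(0.5, heads, flips)


def evidence_unknown_bias(heads, flips):
    """P(D | H2) under a uniform prior on theta.

    The integral of the binomial likelihood over [0, 1] equals 1 / (flips + 1).
    """
    return 1.0 / (flips + 1)


def bayes_factor(heads, flips):
    """Ratio of evidences, P(D | H1) / P(D | H2)."""
    return evidence_fair_coin(heads, flips) / evidence_unknown_bias(heads, flips)


def max_likelihood_ratio(heads, flips):
    """Plug-in alternative: H2 is scored at its maximum likelihood estimate."""
    theta_hat = heads / flips
    return binomial_likelihood(0.5, heads, flips) / binomial_likelihood(
        theta_hat, heads, flips
    )


heads, flips = 6, 10
print(f"Bayes factor (H1 over H2):     {bayes_factor(heads, flips):.4f}")
print(f"Likelihood ratio at the MLE:   {max_likelihood_ratio(heads, flips):.4f}")
```

With 6 heads in 10 flips the Bayes factor comes out above 1, favouring the fair coin, while the plug-in ratio is below 1. The plug-in ratio can never exceed 1 in this setup, because the fitted bias always explains the data at least as well as the fixed value 0.5.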
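An advantage of the use of Bayes factors is that it automatically, and quite naturally, includes a penalty for including too much model structure: a model with more freedom spreads its prior probability over a wider range of possible data sets, and so assigns less probability to any particular one. It thus guards against overfitting.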
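A small sketch can make this penalty concrete. It assumes a hypothetical setting not taken from the original text: a single observation `x` with Gaussian noise of standard deviation 1 around an unknown mean `theta`, and three candidate models that differ only in the width `prior_sd` of a zero-centred Gaussian prior on `theta`. Under these assumptions the evidence is itself Gaussian, with variance equal to the noise variance plus the prior variance.

```python
import math


def normal_pdf(x, mean, variance):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * variance)) / math.sqrt(
        2.0 * math.pi * variance
    )


def evidence(x, prior_sd, noise_sd=1.0):
    """P(x | H) with theta ~ N(0, prior_sd^2) and x | theta ~ N(theta, noise_sd^2).

    Integrating theta out gives x ~ N(0, noise_sd^2 + prior_sd^2).
    """
    return normal_pdf(x, 0.0, noise_sd**2 + prior_sd**2)


def best_fit_likelihood(x, noise_sd=1.0):
    """Likelihood at the maximum likelihood estimate theta_hat = x."""
    return normal_pdf(x, x, noise_sd**2)


x = 3.0  # hypothetical observation
for prior_sd in (0.1, 3.0, 100.0):
    print(
        f"prior_sd={prior_sd:6.1f}  "
        f"evidence={evidence(x, prior_sd):.5f}  "
        f"best-fit likelihood={best_fit_likelihood(x):.5f}"
    )
```

The best-fit likelihood is identical for all three models, so it cannot distinguish them. The evidence is lowest for the very narrow prior, which cannot reach the observation, and for the very wide prior, which wastes probability on values far from it; the intermediate model scores highest.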
Other approaches are:
* to treat model comparison as a decision problem, computing the expected value or cost of each model choice;
* to use Minimum Message Length (MML).
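The decision-problem approach in the first item can be sketched as follows. The evidence values, the equal prior model probabilities and the loss table are all hypothetical numbers chosen for illustration, and the function names are not from any library.

```python
def posterior_model_probabilities(evidences, priors):
    """P(H_i | D) is proportional to P(D | H_i) * P(H_i), normalized over models."""
    joint = [e * p for e, p in zip(evidences, priors)]
    total = sum(joint)  # this sum plays the role of P(D)
    return [j / total for j in joint]


def expected_losses(posteriors, loss):
    """Expected cost of each choice; loss[chosen][true] is the cost table."""
    return [
        sum(loss[chosen][true] * posteriors[true] for true in range(len(posteriors)))
        for chosen in range(len(loss))
    ]


# Hypothetical inputs: evidences P(D | H_1), P(D | H_2) and equal model priors.
evidences = [0.2, 0.1]
priors = [0.5, 0.5]
posteriors = posterior_model_probabilities(evidences, priors)

# Hypothetical asymmetric costs: wrongly choosing H_1 costs 5, wrongly choosing H_2 costs 1.
loss = [
    [0.0, 5.0],  # choose H_1: true H_1, true H_2
    [1.0, 0.0],  # choose H_2: true H_1, true H_2
]
costs = expected_losses(posteriors, loss)
best_choice = min(range(len(costs)), key=costs.__getitem__)

print("posterior model probabilities:", posteriors)
print("expected cost of each choice: ", costs)
print("lowest expected cost: model index", best_choice)
```

In this sketch the first model has the higher posterior probability, yet the second has the lower expected cost because of the asymmetric loss table. This is the sense in which a decision-theoretic comparison can differ from simply picking the most probable model.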
See also:
* Nested sampling algorithm
* Akaike information criterion
* Bayesian information criterion
* Conditional predictive ordinate
* Deviance information criterion
* Minimum Message Length (MML)
Wikimedia Foundation. 2010.