Cramér–von Mises criterion


In statistics, the Cramér–von Mises criterion is used for judging the goodness of fit of a cumulative distribution function F^* compared to a given empirical distribution function F_n, or for comparing two empirical distributions. It is also used as part of other algorithms, such as minimum distance estimation. It is defined as

\omega^2 = \int_{-\infty}^{\infty} [F_n(x)-F^*(x)]^2\,\mathrm{d}F^*(x)

In one-sample applications, F^* is the theoretical distribution and F_n is the empirically observed distribution. Alternatively, the two distributions can both be empirically estimated; this is called the two-sample case.

The criterion is named after Harald Cramér and Richard Edler von Mises, who first proposed it in 1928–1930. The generalization to two samples is due to Anderson.[1]

The Cramér–von Mises test is an alternative to the Kolmogorov–Smirnov test.


Cramér–von Mises test (one sample)

Let x_1,x_2,\cdots,x_n be the observed values, in increasing order. Then the statistic is[1]:1153[2]

T = n \omega^2 = \frac{1}{12n} + \sum_{i=1}^n \left[ \frac{2i-1}{2n}-F(x_i) \right]^2.

If this value is larger than the tabulated value the hypothesis that the data come from the distribution F can be rejected.
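The one-sample statistic above can be computed directly. The following is a minimal NumPy sketch (the function name `cvm_statistic` is illustrative, not from any particular library): sort the observations, evaluate the hypothesized CDF at each point, and apply the formula for T.

```python
import numpy as np

def cvm_statistic(x, cdf):
    """One-sample Cramér–von Mises statistic T = n * omega^2.

    x   : 1-D array of observations.
    cdf : callable evaluating the hypothesized CDF F at an array of points.
    """
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    i = np.arange(1, n + 1)
    # T = 1/(12n) + sum_i [ (2i - 1)/(2n) - F(x_i) ]^2
    return 1.0 / (12 * n) + np.sum(((2 * i - 1) / (2 * n) - cdf(x)) ** 2)
```

For example, testing the sample {0.1, 0.5, 0.9} against the uniform distribution on [0, 1] (whose CDF is the identity there) gives T = 1/36 + 2·(1/15)^2 = 11/300.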

Watson test

A modified version of the Cramér–von Mises test is the Watson test,[3] which uses the statistic U^2, where[2]

U^2= T-n( \bar{F}-\tfrac{1}{2} )^2,

where

\bar{F}=\frac{1}{n} \sum F(x_i).
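Watson's modification can be sketched as a small extension of the one-sample computation, subtracting the correction term n(F̄ − 1/2)^2 from T (again, the function name is illustrative):

```python
import numpy as np

def watson_u2(x, cdf):
    """Watson's statistic U^2 = T - n * (mean(F(x_i)) - 1/2)^2."""
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    i = np.arange(1, n + 1)
    f = np.asarray(cdf(x))
    # One-sample Cramér–von Mises statistic T
    t = 1.0 / (12 * n) + np.sum(((2 * i - 1) / (2 * n) - f) ** 2)
    # Subtract the correction term involving the mean of F(x_i)
    return t - n * (f.mean() - 0.5) ** 2
```

For a sample whose F-values average exactly 1/2, such as {0.1, 0.5, 0.9} under the uniform CDF, the correction vanishes and U^2 equals T.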

Cramér–von Mises test (two samples)

Let x_1,x_2,\cdots,x_N and y_1,y_2,\cdots,y_M be the observed values in the first and second sample respectively, in increasing order. Let r_1,r_2,\cdots,r_N be the ranks of the x's in the combined sample, and let s_1,s_2,\cdots,s_M be the ranks of the y's in the combined sample. Anderson[1]:1149 shows that

T = N \omega^2 = \frac{U}{N M (N+M)}-\frac{4 M N - 1}{6(M+N)}

where U is defined as

U = N \sum_{i=1}^N (r_i-i)^2 + M \sum_{j=1}^M (s_j-j)^2

If the value of T is larger than the tabulated values,[1]:1154–1159 the hypothesis that the two samples come from the same distribution can be rejected. (Some books give critical values for U, which is more convenient, as it avoids the need to compute T via the expression above; the conclusion will be the same.)

The above assumes there are no ties in the x, y, and r sequences, so each x_i is unique and its rank is i in the sorted list x_1, ..., x_N. If there are ties, and x_i through x_j are a run of identical values in the sorted list, one common approach is the midrank method:[4] assign each tied value the "rank" (i + j)/2. In the equations above, ties can then modify all four variables r_i, i, s_j, and j in the expressions (r_i − i)^2 and (s_j − j)^2.
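The two-sample computation, including midranks for ties, can be sketched as follows in NumPy (the function names are illustrative; the midranking here plays the same role as the "average" method of common ranking routines):

```python
import numpy as np

def midranks(a):
    """1-based ranks of a, with tied values assigned the midrank of their run."""
    a = np.asarray(a)
    order = np.argsort(a, kind="mergesort")  # stable sort order
    sa = a[order]
    ranks = np.empty(len(a))
    i = 0
    while i < len(a):
        j = i
        while j + 1 < len(a) and sa[j + 1] == sa[i]:
            j += 1
        # positions i..j (0-based) share the midrank ((i+1) + (j+1)) / 2
        ranks[order[i:j + 1]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def cvm_two_sample(x, y):
    """Anderson's two-sample Cramér–von Mises statistic T."""
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    n, m = len(x), len(y)
    combined = np.concatenate([x, y])
    ranks = midranks(combined)
    r, s = ranks[:n], ranks[n:]          # ranks of x's and y's in combined sample
    i, j = midranks(x), midranks(y)      # within-sample indices, midranked for ties
    u = n * np.sum((r - i) ** 2) + m * np.sum((s - j) ** 2)
    return u / (n * m * (n + m)) - (4 * m * n - 1) / (6 * (m + n))
```

For example, with x = {1, 3} and y = {2, 4}, the combined ranks are r = (1, 3) and s = (2, 4), so U = 2·(0 + 1) + 2·(1 + 4) = 12 and T = 12/16 − 15/24 = 0.125.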

Notes

  1. Anderson (1962)
  2. Pearson & Hartley (1972), p. 118
  3. Watson (1961)
  4. Ruymgaart (1980)


