Neyman-Pearson lemma


In statistics, the Neyman-Pearson lemma states that, when performing a hypothesis test between two point hypotheses $H_0: \theta = \theta_0$ and $H_1: \theta = \theta_1$, the likelihood-ratio test which rejects $H_0$ in favour of $H_1$ when

:$\Lambda(x) = \frac{L(\theta_0 \mid x)}{L(\theta_1 \mid x)} \leq \eta, \mbox{ where } P(\Lambda(X) \leq \eta \mid H_0) = \alpha,$

is the most powerful test of size $\alpha$ for a threshold $\eta$. If the test is most powerful for all $\theta_1 \in \Theta_1$, it is said to be uniformly most powerful (UMP) for alternatives in the set $\Theta_1$.

In practice, the likelihood ratio is often used directly to construct tests (see Likelihood-ratio test). However, it can also be used to suggest particular test statistics that might be of interest, or to suggest simplified tests. For this, one considers algebraic manipulation of the ratio to see whether it contains key statistics that are related to the size of the ratio (i.e. whether a large statistic corresponds to a small ratio or to a large one).
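As a minimal sketch of this kind of manipulation, assume (purely for illustration; this setting and the function names are not taken from the text) an i.i.d. sample from $\mathcal{N}(\mu, \sigma^2)$ with known $\sigma$, and the hypotheses $\mu = \mu_0$ against $\mu = \mu_1 > \mu_0$. Here $\log \Lambda(\mathbf{x}) = n(\mu_1 - \mu_0)(\mu_1 + \mu_0 - 2\bar{x}) / (2\sigma^2)$, so the ratio depends on the data only through the sample mean $\bar{x}$ and is decreasing in it. Rejecting for small $\Lambda$ is therefore the same as rejecting for large $\bar{x}$.

```python
from math import exp, sqrt
from statistics import NormalDist


def likelihood_ratio(xs, mu0, mu1, sigma):
    """Lambda(x) = L(mu0 | x) / L(mu1 | x) for i.i.d. N(mu, sigma^2) data."""
    log_l0 = -sum((x - mu0) ** 2 for x in xs) / (2 * sigma ** 2)
    log_l1 = -sum((x - mu1) ** 2 for x in xs) / (2 * sigma ** 2)
    return exp(log_l0 - log_l1)


def mean_threshold(n, mu0, sigma, alpha):
    """Critical value c with P(sample mean >= c | H0) = alpha (case mu1 > mu0)."""
    return NormalDist(mu0, sigma / sqrt(n)).inv_cdf(1 - alpha)


def power(n, mu0, mu1, sigma, alpha):
    """Probability of rejecting H0 when the true mean is mu1."""
    c = mean_threshold(n, mu0, sigma, alpha)
    return 1 - NormalDist(mu1, sigma / sqrt(n)).cdf(c)
```

For example, `mean_threshold(25, 0.0, 1.0, 0.05)` gives the cut-off for the sample mean of 25 observations, and `power(25, 0.0, 0.5, 1.0, 0.05)` the corresponding power against $\mu_1 = 0.5$.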

Proof

Define the rejection region of the null hypothesis for the Neyman-Pearson test as $R_{NP} = \left\{ x : \frac{L(\theta_0 \mid x)}{L(\theta_1 \mid x)} \leq \eta \right\}$. Any other test will have a different rejection region, which we denote by $R_A$. Furthermore, define the function of region and parameter $P(R, \theta) = \int_R L(\theta \mid x)\, dx$; this is the probability of the data falling in region $R$, given parameter $\theta$.

For both tests to have significance level $\alpha$, it must be true that $\alpha = P(R_{NP}, \theta_0) = P(R_A, \theta_0)$. However, it is useful to break these down into integrals over distinct regions:

:$P(R_{NP} \cap R_A, \theta) + P(R_{NP} \cap R_A^c, \theta) = P(R_{NP}, \theta)$

and

:$P(R_{NP} \cap R_A, \theta) + P(R_{NP}^c \cap R_A, \theta) = P(R_A, \theta).$

Setting $\theta = \theta_0$ and equating the above two expressions yields

:$P(R_{NP} \cap R_A^c, \theta_0) = P(R_{NP}^c \cap R_A, \theta_0).$

Comparing the powers of the two tests, which are $P(R_{NP}, \theta_1)$ and $P(R_A, \theta_1)$, one can see that

:$P(R_{NP}, \theta_1) \geq P(R_A, \theta_1) \mbox{ if, and only if, } P(R_{NP} \cap R_A^c, \theta_1) \geq P(R_{NP}^c \cap R_A, \theta_1).$

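Now, by the definition of $R_{NP}$, we have $L(\theta_1 \mid x) \geq \frac{1}{\eta} L(\theta_0 \mid x)$ for $x$ in $R_{NP}$ and $\frac{1}{\eta} L(\theta_0 \mid x) \geq L(\theta_1 \mid x)$ for $x$ in $R_{NP}^c$ (taking $\eta > 0$). Therefore

:$P(R_{NP} \cap R_A^c, \theta_1) = \int_{R_{NP} \cap R_A^c} L(\theta_1 \mid x)\, dx \geq \frac{1}{\eta} \int_{R_{NP} \cap R_A^c} L(\theta_0 \mid x)\, dx = \frac{1}{\eta} P(R_{NP} \cap R_A^c, \theta_0)$
:$= \frac{1}{\eta} P(R_{NP}^c \cap R_A, \theta_0) = \frac{1}{\eta} \int_{R_{NP}^c \cap R_A} L(\theta_0 \mid x)\, dx \geq \int_{R_{NP}^c \cap R_A} L(\theta_1 \mid x)\, dx = P(R_{NP}^c \cap R_A, \theta_1).$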

Hence the inequality holds.
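The conclusion can be checked by brute force on a small discrete example. The sketch below assumes a hypothetical five-point sample space with made-up probability tables `p0` and `p1` (not from the text). Sums replace the integrals, and exact fractions avoid rounding ties. It enumerates every possible rejection region and compares powers among all regions whose size does not exceed that of the likelihood-ratio region, which is slightly more general than the equal-size comparison in the proof.

```python
from fractions import Fraction as F
from itertools import combinations

# Hypothetical probabilities of the outcomes 0..4 under H0 and H1.
p0 = [F(40, 100), F(30, 100), F(15, 100), F(10, 100), F(5, 100)]
p1 = [F(5, 100), F(10, 100), F(15, 100), F(30, 100), F(40, 100)]


def prob(region, p):
    """P(R, theta): probability that the outcome falls in the region."""
    return sum(p[x] for x in region)


def np_region(eta):
    """Likelihood-ratio region {x : p0(x) / p1(x) <= eta}."""
    return frozenset(x for x in range(len(p0)) if p0[x] <= eta * p1[x])


def all_regions():
    """Every subset of the sample space, i.e. every non-randomized test."""
    points = range(len(p0))
    for size in range(len(p0) + 1):
        for subset in combinations(points, size):
            yield frozenset(subset)


def best_power_at_size(alpha):
    """Largest power among all regions whose size is at most alpha."""
    return max(prob(r, p1) for r in all_regions() if prob(r, p0) <= alpha)


r_np = np_region(F(1, 2))
alpha = prob(r_np, p0)
# No region of size <= alpha is more powerful than the likelihood-ratio region.
assert best_power_at_size(alpha) == prob(r_np, p1)
```

With $\eta = 1/2$ the likelihood-ratio region is $\{3, 4\}$, which has size $3/20$ and power $7/10$.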

Example

Let $X_1, \dots, X_n$ be a random sample from the $\mathcal{N}(\mu, \sigma^2)$ distribution where the mean $\mu$ is known, and suppose that we wish to test $H_0: \sigma^2 = \sigma_0^2$ against $H_1: \sigma^2 = \sigma_1^2$.

The likelihood for this set of normally distributed data is

:$L\left(\sigma^2; \mathbf{x}\right) \propto \left(\sigma^2\right)^{-n/2} \exp\left\{-\frac{\sum_{i=1}^n \left(x_i - \mu\right)^2}{2\sigma^2}\right\}.$

We can compute the likelihood ratio to find the key statistic in this test and its effect on the test's outcome:

:$\Lambda(\mathbf{x}) = \frac{L\left(\sigma_0^2; \mathbf{x}\right)}{L\left(\sigma_1^2; \mathbf{x}\right)} = \left(\frac{\sigma_0^2}{\sigma_1^2}\right)^{-n/2} \exp\left\{-\frac{1}{2}\left(\sigma_0^{-2} - \sigma_1^{-2}\right) \sum_{i=1}^n \left(x_i - \mu\right)^2\right\}.$

This ratio depends on the data only through $\sum_{i=1}^n \left(x_i - \mu\right)^2$. Therefore, by the Neyman-Pearson lemma, the most powerful test of this type of hypothesis for this data will depend only on $\sum_{i=1}^n \left(x_i - \mu\right)^2$. Also, by inspection, we can see that if $\sigma_1^2 > \sigma_0^2$, then $\sigma_0^{-2} - \sigma_1^{-2} > 0$ and $\Lambda(\mathbf{x})$ is a decreasing function of $\sum_{i=1}^n \left(x_i - \mu\right)^2$. Since the test rejects $H_0$ when $\Lambda(\mathbf{x})$ is small, we should reject $H_0$ if $\sum_{i=1}^n \left(x_i - \mu\right)^2$ is sufficiently large. The rejection threshold depends on the size of the test.
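The threshold can be made explicit. Taking $\Lambda = L(\sigma_0^2; \mathbf{x}) / L(\sigma_1^2; \mathbf{x})$ and $\sigma_1^2 > \sigma_0^2$, small values of the ratio correspond to large values of $\sum_{i=1}^n (x_i - \mu)^2$. Because $\mu$ is known, $\sum_{i=1}^n (x_i - \mu)^2 / \sigma_0^2$ follows a chi-squared distribution with $n$ degrees of freedom under $H_0$. The sketch below is one way to compute the critical value and the power using only the standard library. It assumes an even $n$, for which the chi-squared upper tail has a closed form, and all function names are hypothetical.

```python
from math import exp


def chi2_sf_even(x, df):
    """P(chi-squared_df > x) for even df, via the closed-form Poisson sum."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only handles positive even df")
    if x <= 0:
        return 1.0
    half = x / 2
    term, total = 1.0, 0.0
    for j in range(df // 2):
        total += term            # adds (x/2)^j / j!
        term *= half / (j + 1)
    return exp(-half) * total


def chi2_upper_quantile_even(alpha, df):
    """Value q with P(chi-squared_df > q) = alpha, found by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_sf_even(hi, df) > alpha:
        hi *= 2
    for _ in range(200):
        mid = (lo + hi) / 2
        if chi2_sf_even(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def variance_test(n, sigma0_sq, sigma1_sq, alpha):
    """Critical value c for sum((x_i - mu)^2) and power, for sigma1_sq > sigma0_sq."""
    c = sigma0_sq * chi2_upper_quantile_even(alpha, n)   # reject H0 when sum >= c
    power = chi2_sf_even(c / sigma1_sq, n)               # P(sum >= c | sigma1_sq)
    return c, power
```

For instance, `variance_test(10, 1.0, 2.0, 0.05)` returns the cut-off for the sum of squares and the power of the size-0.05 test against $\sigma_1^2 = 2$ when $n = 10$.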

See also

* Statistical power
* Receiver operating characteristic

References

* Jerzy Neyman, Egon Pearson (1933). "On the Problem of the Most Efficient Tests of Statistical Hypotheses". Philosophical Transactions of the Royal Society of London. Series A, Containing Papers of a Mathematical or Physical Character 231: 289–337. doi:10.1098/rsta.1933.0009. http://links.jstor.org/sici?sici=0264-3952%281933%29231%3C289%3AOTPOTM%3E2.0.CO%3B2-X

* [http://cnx.org/content/m11548/latest/ cnx.org: Neyman-Pearson criterion]

External links

* MIT OpenCourseWare lecture notes: [http://ocw.mit.edu/NR/rdonlyres/Mathematics/18-443Fall2003/18B765F6-A398-48BF-A893-49A4965DED98/0/lec19.pdf most powerful tests] , [http://ocw.mit.edu/NR/rdonlyres/Mathematics/18-443Fall2003/D6F12E47-A9A2-4FE0-AC3C-588B6A5EE5B6/0/lec20.pdf uniformly most powerful tests]


Wikimedia Foundation. 2010.
