Empirical distribution function

In statistics, an empirical distribution function is a cumulative distribution function that places probability $1/n$ on each of the $n$ numbers in a sample.

Let $X_1,\ldots,X_n$ be iid random variables in $\mathbb{R}$ with the cdf $F(x)$.

The empirical distribution function $F_n(x)$ based on the sample $X_1,\ldots,X_n$ is a step function defined by

: $F_n(x) = \frac{\text{number of elements in the sample} \le x}{n} = \frac{1}{n} \sum_{i=1}^n I(X_i \le x),$

where $I(A)$ is the indicator of the event $A$.

For fixed $x$, $I(X_i \le x)$ is a Bernoulli random variable with parameter $p = F(x)$; hence $nF_n(x)$ is a binomial random variable with mean $nF(x)$ and variance $nF(x)(1 - F(x))$.
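The definition above translates directly into code. The following is a minimal Python sketch, assuming NumPy is available; the helper name `ecdf` and the four-point sample are illustrative and not part of the source. It sorts the sample once and then counts the elements less than or equal to $x$ with a binary search.

```python
import numpy as np

def ecdf(sample):
    """Return a function x -> F_n(x) for a one-dimensional sample."""
    xs = np.sort(np.asarray(sample, dtype=float))
    n = xs.size
    if n == 0:
        raise ValueError("sample must be non-empty")

    def F_n(x):
        # side="right" counts the elements <= x, matching I(X_i <= x)
        return np.searchsorted(xs, x, side="right") / n

    return F_n

# Hypothetical sample with a repeated value, to show a jump of size 2/n
F_n = ecdf([3.0, 1.0, 2.0, 2.0])
values = F_n(np.array([0.5, 1.0, 2.0, 2.5, 3.0]))
print(values)
```

Because the returned function is right-continuous and jumps by $k/n$ at a value that occurs $k$ times, it agrees with the indicator-sum form of the definition even when the sample contains ties.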

Asymptotic properties

* By the strong law of large numbers,

:: $F_n(x) \to F(x)$ almost surely for fixed $x$.

:In other words, $F_n(x)$ is a strongly consistent estimator of the cumulative distribution function $F(x)$. It is also unbiased, since $E[F_n(x)] = F(x)$ for every $n$.

* By the central limit theorem,

:: $\sqrt{n}\,(F_n(x) - F(x))$

:converges in distribution to a normal distribution $N(0, F(x)(1 - F(x)))$ for fixed $x$.
:The Berry–Esseen theorem provides the rate of this convergence.

* By the Glivenko-Cantelli theorem, $F_n(x) \to F(x)$ uniformly over $x$, that is,

:: $\|F_n - F\|_\infty = \sup_x |F_n(x) - F(x)| \to 0$ with probability 1.

:The Dvoretzky-Kiefer-Wolfowitz inequality provides the rate of this convergence.

* Kolmogorov showed that

:: $\sqrt{n}\,\|F_n - F\|_\infty$

:converges in distribution to the Kolmogorov distribution, provided that $F(x)$ is continuous.
:The Kolmogorov-Smirnov test for goodness of fit is based on this fact.

* By Donsker's theorem,

:: $\sqrt{n}\,(F_n - F)$,

:as a process indexed by $x$, converges weakly in $\ell^\infty(\mathbb{R})$ to a Brownian bridge $B(F(x))$.
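The uniform results in this list can be explored numerically. The sketch below, in Python with NumPy, computes $\sup_x |F_n(x) - F(x)|$ for a continuous $F$ and the half-width implied by the Dvoretzky-Kiefer-Wolfowitz inequality in its commonly stated form $P(\sup_x |F_n(x) - F(x)| > \varepsilon) \le 2e^{-2n\varepsilon^2}$. The function names `sup_distance` and `dkw_epsilon`, the Uniform(0, 1) example, and the seed are assumptions made for illustration only.

```python
import numpy as np

def sup_distance(sample, cdf):
    """Compute sup_x |F_n(x) - F(x)|, assuming the cdf F is continuous."""
    xs = np.sort(np.asarray(sample, dtype=float))
    n = xs.size
    F = cdf(xs)
    # F_n jumps from (i-1)/n to i/n at the i-th order statistic, so the
    # supremum is reached at a jump or just to the left of one.
    upper = np.max(np.arange(1, n + 1) / n - F)
    lower = np.max(F - np.arange(0, n) / n)
    return max(upper, lower)

def dkw_epsilon(n, alpha=0.05):
    """Half-width eps with 2 * exp(-2 * n * eps**2) = alpha (DKW bound)."""
    return np.sqrt(np.log(2.0 / alpha) / (2.0 * n))

def uniform_cdf(x):
    # cdf of the Uniform(0, 1) distribution, used here as the true F
    return np.clip(x, 0.0, 1.0)

rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily
results = {}
for n in (100, 10_000):
    d = sup_distance(rng.uniform(size=n), uniform_cdf)
    results[n] = d
    # columns: n, sup distance, sqrt(n)-rescaled distance, 95% DKW half-width
    print(n, d, np.sqrt(n) * d, dkw_epsilon(n))
```

In a run like this one would expect the sup distance to shrink as $n$ grows while the rescaled value $\sqrt{n}\,\|F_n - F\|_\infty$ stays of order one, in line with the Glivenko-Cantelli and Kolmogorov statements above. The band $F_n(x) \pm \varepsilon$ with $\varepsilon$ from `dkw_epsilon` then covers $F$ simultaneously for all $x$ with probability at least $1 - \alpha$.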

See also

* Càdlàg functions
* Empirical probability
* Empirical process


Wikimedia Foundation. 2010.
