 Cramér–Rao bound

In estimation theory and statistics, the Cramér–Rao bound (CRB) or Cramér–Rao lower bound (CRLB), named in honor of Harald Cramér and Calyampudi Radhakrishna Rao who were among the first to derive it,^{[1]}^{[2]}^{[3]} expresses a lower bound on the variance of estimators of a deterministic parameter. The bound is also known as the Cramér–Rao inequality or the information inequality.
In its simplest form, the bound states that the variance of any unbiased estimator is at least as high as the inverse of the Fisher information. An unbiased estimator which achieves this lower bound is said to be efficient. Such a solution achieves the lowest possible mean squared error among all unbiased methods, and is therefore the minimum variance unbiased (MVU) estimator. However, in some cases, no unbiased technique exists which achieves the bound. This may occur even when an MVU estimator exists.
The Cramér–Rao bound can also be used to bound the variance of biased estimators of given bias. In some cases, a biased approach can result in both a variance and a mean squared error that are below the unbiased Cramér–Rao lower bound; see estimator bias.
Statement
The Cramér–Rao bound is stated in this section for several increasingly general cases, beginning with the case in which the parameter is a scalar and its estimator is unbiased. All versions of the bound require certain regularity conditions, which hold for most well-behaved distributions. These conditions are listed later in this section.
Scalar unbiased case
Suppose θ is an unknown deterministic parameter which is to be estimated from measurements x, distributed according to some probability density function f(x;θ). The variance of any unbiased estimator θ̂ of θ is then bounded by the inverse of the Fisher information I(θ):

    var(θ̂) ≥ 1/I(θ),

where the Fisher information I(θ) is defined by

    I(θ) = E[ (∂ ln f(x;θ)/∂θ)^{2} ],

ln f(x;θ) is the natural logarithm of the likelihood function, and E denotes the expected value (over x).
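As a concrete numerical illustration, the Fisher information in the definition above can be estimated by Monte Carlo as the mean of the squared score. The sketch below (Python; the function name `fisher_info_mc` is ours, purely for illustration) does this for a single Bernoulli(p) observation, for which the closed form I(p) = 1/(p(1−p)) is known:

```python
import random

def fisher_info_mc(p, n_samples=200_000, seed=0):
    """Monte Carlo estimate of the Fisher information of one Bernoulli(p)
    observation: the mean of the squared score (d/dp ln f(x;p))^2."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = 1 if rng.random() < p else 0
        # d/dp ln f(x;p) for f(x;p) = p^x (1-p)^(1-x)
        score = x / p - (1 - x) / (1 - p)
        total += score * score
    return total / n_samples

p = 0.3
analytic = 1.0 / (p * (1 - p))   # known closed form: I(p) = 1/(p(1-p))
estimate = fisher_info_mc(p)
print(analytic, estimate)        # both close to 4.76
```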
The efficiency of an unbiased estimator θ̂ measures how close this estimator's variance comes to this lower bound; estimator efficiency is defined as

    e(θ̂) = ( 1/I(θ) ) / var(θ̂),

or the minimum possible variance for an unbiased estimator divided by its actual variance. The Cramér–Rao lower bound thus gives

    e(θ̂) ≤ 1.
General scalar case
A more general form of the bound can be obtained by considering an unbiased estimator T(X) of a function ψ(θ) of the parameter θ. Here, unbiasedness is understood as stating that E{T(X)} = ψ(θ). In this case, the bound is given by

    var(T) ≥ (ψ'(θ))^{2} / I(θ),
where ψ'(θ) is the derivative of ψ(θ), and I(θ) is the Fisher information defined above.
Apart from being a bound on estimators of functions of the parameter, this approach can be used to derive a bound on the variance of biased estimators with a given bias, as follows. Consider an estimator θ̂ with bias b(θ) = E{θ̂} − θ, and let ψ(θ) = b(θ) + θ. By the result above, any unbiased estimator whose expectation is ψ(θ) has variance greater than or equal to (ψ'(θ))^{2} / I(θ). Thus, any estimator θ̂ whose bias is given by the function b(θ) satisfies

    var(θ̂) ≥ (1 + b'(θ))^{2} / I(θ).
Clearly, the unbiased version of the bound is a special case of this result, with b(θ) = 0.
Of course, it is trivial to have a small variance: an "estimator" that is constant has a variance of zero. But from the above equation, together with the decomposition of mean squared error into variance plus squared bias, we find that the mean squared error of a biased estimator is bounded by

    E[ (θ̂ − θ)^{2} ] ≥ (1 + b'(θ))^{2}/I(θ) + b(θ)^{2},

and this can be less than the unbiased Cramér–Rao bound 1/I(θ). See the example of estimating variance below.
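As a minimal numeric illustration of the biased bound, consider (hypothetically) the shrinkage estimator a·X̄ of the mean θ of a normal sample with known variance. Its bias is b(θ) = (a − 1)θ, its exact mean squared error is a^{2}σ^{2}/n + (a − 1)^{2}θ^{2}, which coincides with the biased bound above, and for suitable a it falls below the unbiased bound σ^{2}/n. All names in the sketch are ours:

```python
# Shrinkage estimator a*mean(X) for the mean θ of N(θ, σ²), n samples.
# (Hypothetical illustration; function and variable names are ours.)

def biased_crb_mse(a, theta, sigma2, n):
    # biased Cramér–Rao bound on the MSE: (1 + b'(θ))²/I(θ) + b(θ)²
    info = n / sigma2               # Fisher information of the whole sample
    bias = (a - 1.0) * theta        # b(θ) = (a − 1)θ, so b'(θ) = a − 1
    return (1.0 + (a - 1.0)) ** 2 / info + bias ** 2

def shrinkage_mse(a, theta, sigma2, n):
    # exact MSE of a*mean(X): variance a²σ²/n plus squared bias
    return a * a * sigma2 / n + ((a - 1.0) * theta) ** 2

theta, sigma2, n = 1.0, 4.0, 10
unbiased_crb = sigma2 / n           # = 0.4
for a in (1.0, 0.9, 0.8):
    print(a, shrinkage_mse(a, theta, sigma2, n), biased_crb_mse(a, theta, sigma2, n))
# for a = 0.9 the MSE is 0.334, below the unbiased bound 0.4
```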
Multivariate case
Extending the Cramér–Rao bound to multiple parameters, define a parameter column vector

    θ = [θ_{1}, θ_{2}, …, θ_{d}]^{T}

with probability density function f(x;θ) which satisfies the two regularity conditions below.
The Fisher information matrix is a d × d matrix with element I_{m,k} defined as

    I_{m,k} = E[ (∂ ln f(x;θ)/∂θ_{m}) (∂ ln f(x;θ)/∂θ_{k}) ].

Let T(X) be an estimator of any vector function of parameters, T(X) = (T_{1}(X), …, T_{d}(X))^{T}, and denote its expectation vector E[T(X)] by ψ(θ). The Cramér–Rao bound then states that the covariance matrix of T(X) satisfies

    cov_{θ}(T(X)) ≥ (∂ψ(θ)/∂θ) [I(θ)]^{−1} (∂ψ(θ)/∂θ)^{T},

where
 The matrix inequality A ≥ B is understood to mean that the matrix A − B is positive semidefinite, and
 ∂ψ(θ)/∂θ is the Jacobian matrix whose ijth element is given by ∂ψ_{i}(θ)/∂θ_{j}.
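A small deterministic check of the matrix inequality, using the standard fact (assumed here, not derived above) that for n iid N(μ, σ²) observations with θ = (μ, σ²) the Fisher information matrix is diag(n/σ², n/(2σ⁴)): the unbiased pair (sample mean, sample variance with divisor n − 1) is independent with covariance diag(σ²/n, 2σ⁴/(n − 1)), so the difference cov − I(θ)⁻¹ should be positive semidefinite. A sketch, with illustrative names:

```python
# Deterministic check that cov − I(θ)⁻¹ is positive semidefinite for the
# unbiased pair (sample mean, sample variance with divisor n−1) of a normal
# sample; the Fisher information matrix diag(n/σ², n/(2σ⁴)) is assumed known.

n, sigma2 = 20, 3.0

crb = (sigma2 / n, 2 * sigma2**2 / n)        # diagonal of I(θ)^{-1}
cov = (sigma2 / n, 2 * sigma2**2 / (n - 1))  # exact diag of cov(mean, s²)

# both matrices are diagonal here (mean and s² are independent for normals),
# so cov − I⁻¹ is PSD iff every diagonal entry is ≥ 0
d = [c - b for c, b in zip(cov, crb)]
assert all(x >= 0 for x in d)
print(d)  # [0.0, 2σ⁴/(n(n−1))]: the mean attains its bound, s² does not
```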
If T(X) is an unbiased estimator of θ (i.e., ψ(θ) = θ), then the Cramér–Rao bound reduces to

    cov_{θ}(T(X)) ≥ [I(θ)]^{−1}.

Regularity conditions
The bound relies on two weak regularity conditions on the probability density function, f(x;θ), and the estimator T(X):
 The Fisher information is always defined; equivalently, for all x such that f(x;θ) > 0,

    ∂ ln f(x;θ)/∂θ

 exists and is finite.
 The operations of integration with respect to x and differentiation with respect to θ can be interchanged in the expectation of T; that is,

    ∂/∂θ [ ∫ T(x) f(x;θ) dx ] = ∫ T(x) [ ∂f(x;θ)/∂θ ] dx

 whenever the right-hand side is finite.
 This condition can often be confirmed by using the fact that integration and differentiation can be swapped when either of the following cases holds:
 The function f(x;θ) has bounded support in x, and the bounds do not depend on θ;
 The function f(x;θ) has infinite support, is continuously differentiable, and the integral converges uniformly for all θ.
Simplified form of the Fisher information
Suppose, in addition, that the operations of integration and differentiation can be swapped for the second derivative of f(x;θ) as well, i.e.,

    ∂^{2}/∂θ^{2} [ ∫ f(x;θ) dx ] = ∫ ∂^{2}f(x;θ)/∂θ^{2} dx.

In this case, it can be shown that the Fisher information equals

    I(θ) = −E[ ∂^{2} ln f(x;θ)/∂θ^{2} ].

The Cramér–Rao bound can then be written as

    var(θ̂) ≥ 1/I(θ) = 1 / ( −E[ ∂^{2} ln f(x;θ)/∂θ^{2} ] ).
In some cases, this formula gives a more convenient technique for evaluating the bound.
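The equality of the two expressions for the Fisher information can also be checked numerically. The sketch below (a Monte Carlo estimate for a single N(θ, σ²) observation; function names are illustrative) compares E[(∂ ln f/∂θ)²] with −E[∂² ln f/∂θ²], both of which should equal 1/σ²:

```python
import random

# Monte Carlo check (single N(θ, σ²) observation) that the squared-score
# form and the second-derivative form of the Fisher information agree.

def fisher_both_forms(theta, sigma, n_samples=100_000, seed=2):
    rng = random.Random(seed)
    sq_score = 0.0
    neg_second = 0.0
    for _ in range(n_samples):
        x = rng.gauss(theta, sigma)
        score = (x - theta) / sigma**2   # ∂/∂θ ln f(x;θ)
        sq_score += score * score
        neg_second += 1.0 / sigma**2     # −∂²/∂θ² ln f(x;θ), constant here
    return sq_score / n_samples, neg_second / n_samples

a, b = fisher_both_forms(0.0, 2.0)
print(a, b)  # both ≈ 0.25 = 1/σ²
```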
Single-parameter proof
The following is a proof of the general scalar case of the Cramér–Rao bound, which was described above; namely, that if the expectation of T is denoted by ψ(θ), then, for all θ,

    var(T) ≥ (ψ'(θ))^{2} / I(θ).

Let X be a random variable with probability density function f(x;θ). Here T = t(X) is a statistic, which is used as an estimator for ψ(θ). If V is the score, i.e.

    V = ∂ ln f(X;θ)/∂θ,

then the expectation of V, written E(V), is zero, and its variance is var(V) = E(V^{2}) = I(θ). If we consider the covariance cov(V,T) of V and T, we have cov(V,T) = E(VT), because E(V) = 0. Expanding this expression we have

    cov(V,T) = E[ T · ∂ ln f(X;θ)/∂θ ] = ∫ t(x) [ ∂ ln f(x;θ)/∂θ ] f(x;θ) dx.

This may be expanded using the chain rule

    ∂ ln f(x;θ)/∂θ = (1/f(x;θ)) ∂f(x;θ)/∂θ,

and the definition of expectation gives, after cancelling f(x;θ),

    cov(V,T) = ∫ t(x) [ ∂f(x;θ)/∂θ ] dx = ∂/∂θ [ ∫ t(x) f(x;θ) dx ] = ψ'(θ),

because the integration and differentiation operations commute (second condition).
The Cauchy–Schwarz inequality shows that

    √( var(T) var(V) ) ≥ |cov(V,T)| = |ψ'(θ)|,

therefore

    var(T) ≥ (ψ'(θ))^{2} / var(V) = (ψ'(θ))^{2} / I(θ),

which proves the proposition.
Examples
Multivariate normal distribution
For the case of a d-variate normal distribution

    x ~ N_{d}( μ(θ), C(θ) ),

the Fisher information matrix has elements^{[4]}

    I_{m,k} = (∂μ/∂θ_{m})^{T} C^{−1} (∂μ/∂θ_{k}) + (1/2) tr( C^{−1} (∂C/∂θ_{m}) C^{−1} (∂C/∂θ_{k}) ),

where "tr" is the trace.
For example, let w[1], …, w[N] be a sample of N independent observations with unknown mean θ and known variance σ^{2}:

    w[n] ~ N(θ, σ^{2}),  n = 1, …, N.

Then the Fisher information is a scalar given by

    I(θ) = (∂μ/∂θ)^{T} C^{−1} (∂μ/∂θ) = Σ_{n=1}^{N} 1/σ^{2} = N/σ^{2}

(here μ = θ·[1, …, 1]^{T} and C = σ^{2} I_{N}, so the trace term vanishes), and so the Cramér–Rao bound is

    var(θ̂) ≥ σ^{2}/N.
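A quick simulation (illustrative Python sketch; function names are ours) confirms that the sample mean attains this bound: its empirical variance over many trials is close to σ²/N.

```python
import random

# Monte Carlo sketch: the sample mean of N iid N(θ, σ²) observations
# has empirical variance close to the Cramér–Rao bound σ²/N.

def mean_estimator_variance(theta, sigma, N, trials=50_000, seed=3):
    rng = random.Random(seed)
    means = [sum(rng.gauss(theta, sigma) for _ in range(N)) / N
             for _ in range(trials)]
    avg = sum(means) / trials
    return sum((m - avg) ** 2 for m in means) / trials

theta, sigma, N = 1.5, 2.0, 10
print(mean_estimator_variance(theta, sigma, N), sigma**2 / N)  # ≈ 0.4 and 0.4
```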
Normal variance with known mean
Suppose X_{1}, …, X_{n} are independent, normally distributed random variables with known mean μ and unknown variance σ^{2}. Consider the following statistic:

    T = (1/n) Σ_{i=1}^{n} (X_{i} − μ)^{2}.

Then T is unbiased for σ^{2}, as E(T) = σ^{2}. What is the variance of T?

    var(T) = (1/n) var( (X − μ)^{2} ) = (1/n) [ E{ (X − μ)^{4} } − ( E{ (X − μ)^{2} } )^{2} ]

(the second equality follows directly from the definition of variance). The first term is the fourth moment about the mean and has value 3(σ^{2})^{2}; the second is the square of the variance, or (σ^{2})^{2}. Thus

    var(T) = 2(σ^{2})^{2}/n.
Now, what is the Fisher information in the sample? Recall that the score V is defined as

    V = ∂ ln L(σ^{2}; X)/∂σ^{2},

where L is the likelihood function. Thus in this case, for a single observation X,

    V = ∂/∂σ^{2} [ −(1/2) ln(2πσ^{2}) − (X − μ)^{2}/(2σ^{2}) ] = (X − μ)^{2}/(2(σ^{2})^{2}) − 1/(2σ^{2}),

where the second equality is from elementary calculus. Thus, the information in a single observation is just minus the expectation of the derivative of V, or

    I_{1}(σ^{2}) = −E( ∂V/∂σ^{2} ) = E( (X − μ)^{2}/(σ^{2})^{3} − 1/(2(σ^{2})^{2}) ) = 1/(2(σ^{2})^{2}).

Thus the information in a sample of n independent observations is just n times this, or

    I_{n}(σ^{2}) = n/(2(σ^{2})^{2}).

The Cramér–Rao bound states that

    var(T) ≥ 1/I_{n}(σ^{2}) = 2(σ^{2})^{2}/n.

In this case, the inequality is saturated (equality is achieved), showing that the estimator T is efficient.
However, we can achieve a lower mean squared error using a biased estimator. The estimator

    T' = (1/(n+2)) Σ_{i=1}^{n} (X_{i} − μ)^{2}

obviously has a smaller variance, which is in fact

    var(T') = 2n(σ^{2})^{2}/(n+2)^{2}.

Its bias is

    E(T') − σ^{2} = (n/(n+2)) σ^{2} − σ^{2} = −2σ^{2}/(n+2),

so its mean squared error is

    MSE(T') = var(T') + (bias)^{2} = [ 2n/(n+2)^{2} + 4/(n+2)^{2} ] (σ^{2})^{2} = 2(σ^{2})^{2}/(n+2),

which is clearly less than the Cramér–Rao bound found above.
When the mean is not known, the minimum mean squared error estimate of the variance of a sample from a Gaussian distribution is achieved by dividing by n + 1, rather than n − 1 or n + 2.
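The divisor comparison for the unknown-mean case can be verified exactly, using the standard fact that S = Σ(Xᵢ − X̄)² satisfies S/σ² ~ χ² with k = n − 1 degrees of freedom, so E S = kσ², var S = 2kσ⁴, and MSE(S/c) = σ⁴[2k/c² + (k/c − 1)²]. A sketch (function and variable names are ours):

```python
# Exact MSE of the scaled sum of squares S/c, where S = Σ(Xᵢ − X̄)² and
# S/σ² ~ χ² with k = n − 1 degrees of freedom (E S = kσ², var S = 2kσ⁴).

def mse(k, c, sigma4=1.0):
    # MSE(S/c) = var(S)/c² + (E S/c − σ²)² = σ⁴ [2k/c² + (k/c − 1)²]
    return sigma4 * (2 * k / c**2 + (k / c - 1.0) ** 2)

n = 10
k = n - 1  # degrees of freedom when the mean is estimated from the sample
for c in (n - 1, n, n + 1, n + 2):
    print(c, mse(k, c))
# the divisor c = n + 1 gives the smallest mean squared error
```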
See also
 Chapman–Robbins bound
 Kullback's inequality
References and notes
 ^ Cramér, Harald (1946). Mathematical Methods of Statistics. Princeton, NJ: Princeton Univ. Press. ISBN 0691080046. OCLC 185436716.
 ^ Rao, Calyampudi Radhakrishna (1945). "Information and the accuracy attainable in the estimation of statistical parameters". Bulletin of the Calcutta Mathematical Society 37: 81–89. MR0015748.
 ^ Rao, Calyampudi Radhakrishna (1994). S. Das Gupta, ed. Selected Papers of C. R. Rao. New York: Wiley. ISBN 9780470220917. OCLC 174244259.
 ^ Kay, S. M. (1993). Fundamentals of Statistical Signal Processing: Estimation Theory. Prentice Hall. p. 47. ISBN 0130422681.