Delta method

In statistics, the delta method is a method for deriving an approximate probability distribution for a function of an asymptotically normal statistical estimator from knowledge of the limiting variance of that estimator. More broadly, the delta method may be considered a fairly general central limit theorem.

Univariate delta method

While the delta method generalizes easily to a multivariate setting, careful motivation of the technique is more easily demonstrated in univariate terms. Roughly, for some sequence of random variables Xn satisfying

\sqrt{n}\,[X_n-\theta] \,\xrightarrow{D}\, N(0,\sigma^2),

where θ and σ2 are finite-valued constants and \xrightarrow{D} denotes convergence in distribution, it is the case that

\sqrt{n}\,[g(X_n)-g(\theta)] \,\xrightarrow{D}\, N(0,\sigma^2[g'(\theta)]^2)

for any function g such that g′(θ) exists and is non-zero. (This restriction is needed only for clarity of argument and application: should the first derivative vanish at θ, the delta method can be extended via a second- or higher-order Taylor expansion.)
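The convergence above is easy to check numerically. The sketch below is illustrative only: it takes Xn to be the mean of n Exp(1) draws (so θ = σ2 = 1) and chooses g(x) = x², then compares the Monte Carlo variance of √n[g(Xn) − g(θ)] with the delta-method prediction σ²[g′(θ)]².

```python
import numpy as np

rng = np.random.default_rng(0)

n, reps = 2_000, 5_000            # sample size and Monte Carlo replications
theta, sigma2 = 1.0, 1.0          # mean and variance of Exp(1)

# X_n is the sample mean of n Exp(1) draws; sqrt(n)(X_n - theta) -> N(0, 1)
x_bar = rng.exponential(theta, size=(reps, n)).mean(axis=1)

g = lambda x: x ** 2              # any smooth g with g'(theta) != 0
g_prime = 2 * theta               # g'(theta) = 2

z = np.sqrt(n) * (g(x_bar) - g(theta))
print(np.var(z))                  # close to sigma2 * g_prime**2 = 4
```

With these choices the delta method predicts a limiting variance of 4, and the simulated variance should be close to that for large n.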

Proof in the univariate case

Demonstration of this result is fairly straightforward under the assumption that g′ is continuous. To begin, we use the mean value theorem:

g(X_n)=g(\theta)+g'(\tilde{\theta})(X_n-\theta),

where \tilde{\theta} lies between Xn and θ. Note that since X_n\,\xrightarrow{P}\,\theta implies \tilde{\theta} \,\xrightarrow{P}\,\theta, and since g′ is continuous, applying the continuous mapping theorem yields

g'(\tilde{\theta})\,\xrightarrow{P}\,g'(\theta),

where \xrightarrow{P} denotes convergence in probability.

Rearranging the terms and multiplying by \sqrt{n} gives

\sqrt{n}[g(X_n)-g(\theta)]=g'(\tilde{\theta})\sqrt{n}[X_n-\theta].

Since

\sqrt{n}\,[X_n-\theta] \,\xrightarrow{D}\, N(0,\sigma^2)

by assumption, it follows immediately from Slutsky's theorem that

\sqrt{n}\,[g(X_n)-g(\theta)] \,\xrightarrow{D}\, N(0,\sigma^2[g'(\theta)]^2).

This concludes the proof.

Motivation of multivariate delta method

By definition, a consistent estimator B converges in probability to its true value β, and often a central limit theorem can be applied to obtain asymptotic normality:


\sqrt{n}\left(B-\beta\right)\,\xrightarrow{D}\,N\left(0, \Sigma \right),

where n is the number of observations and Σ is a (symmetric positive semi-definite) covariance matrix. Suppose we want to estimate the variance of a function h of the estimator B. Keeping only the first two terms of the Taylor series, and using vector notation for the gradient, we can estimate h(B) as


h(B) \approx h(\beta) + \nabla h(\beta)^T \cdot (B-\beta)

which implies the variance of h(B) is approximately


\begin{align}
\operatorname{Var}\left(h(B)\right) & \approx \operatorname{Var}\left(h(\beta) + \nabla h(\beta)^T \cdot (B-\beta)\right) \\

 & = \operatorname{Var}\left(h(\beta) + \nabla h(\beta)^T \cdot B - \nabla h(\beta)^T \cdot \beta\right) \\

 & = \operatorname{Var}\left(\nabla h(\beta)^T \cdot B\right) \\

 & = \nabla h(\beta)^T \cdot \operatorname{Var}(B) \cdot \nabla h(\beta) \\

 & = \nabla h(\beta)^T \cdot (\Sigma/n) \cdot \nabla h(\beta)
\end{align}

One can use the mean value theorem (for real-valued functions of several variables) to see that the result does not essentially rely on truncating the Taylor series at first order.

The delta method therefore implies that


\sqrt{n}\left(h(B)-h(\beta)\right)\,\xrightarrow{D}\,N\left(0, \nabla h(\beta)^T \cdot \Sigma \cdot \nabla h(\beta) \right)

or in univariate terms,


\sqrt{n}\left(h(B)-h(\beta)\right)\,\xrightarrow{D}\,N\left(0, \sigma^2 \cdot \left(h^\prime(\beta)\right)^2 \right).
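The multivariate limit can also be checked by simulation. The sketch below uses an illustrative choice of h (the ratio h(B) = B1/B2), β, and Σ; since for normal data the sample mean B is exactly N(β, Σ/n), B is drawn directly rather than simulating raw observations.

```python
import numpy as np

rng = np.random.default_rng(0)

beta = np.array([2.0, 1.0])                 # true parameter (illustrative)
Sigma = np.array([[1.0, 0.5], [0.5, 1.0]]) # limiting covariance (illustrative)
n, reps = 5_000, 50_000

h = lambda b: b[..., 0] / b[..., 1]         # h(B) = B1 / B2
grad = np.array([1 / beta[1], -beta[0] / beta[1] ** 2])  # gradient of h at beta

delta_var = grad @ Sigma @ grad             # nabla h^T  Sigma  nabla h

# For normal data the estimator B (a sample mean) is exactly N(beta, Sigma/n),
# so we can draw it directly instead of simulating raw samples.
B = rng.multivariate_normal(beta, Sigma / n, size=reps)
mc_var = n * np.var(h(B))                   # Monte Carlo estimate of the limit

print(delta_var, mc_var)                    # both close to 3
```

Here the delta-method variance ∇h(β)^T Σ ∇h(β) evaluates to 3, and the scaled Monte Carlo variance agrees up to simulation error and higher-order terms.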

Example

Suppose Xn is binomial with parameters n and p. Since

\sqrt{n} \left[ \frac{X_n}{n}-p \right] \,\xrightarrow{D}\, N(0,\,p(1-p)),

we can apply the delta method with g(θ) = log(θ) to see

\sqrt{n} \left[ \log\left( \frac{X_n}{n}\right)-\log(p)\right] \,\xrightarrow{D}\, N(0,\,p(1-p)[1/p]^2).

Hence, the variance of  \log \left( \frac{X_n}{n} \right) is approximately

\frac{1-p}{p\,n}.
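A minimal simulation of this example, with illustrative choices of n and p, compares the Monte Carlo variance of log(Xn/n) to the delta-method approximation (1 − p)/(pn):

```python
import numpy as np

rng = np.random.default_rng(0)

n, p, reps = 1_000, 0.3, 50_000             # illustrative values

x = rng.binomial(n, p, size=reps)           # X_n ~ Binomial(n, p)
log_phat = np.log(x / n)                    # log of the sample proportion

mc_var = np.var(log_phat)                   # Monte Carlo variance
delta_var = (1 - p) / (p * n)               # delta-method approximation

print(mc_var, delta_var)                    # both close to 0.00233
```

(For n this large the event Xn = 0, where the logarithm is undefined, has negligible probability.)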

Moreover, if \hat p and \hat q are estimates of different group rates from independent samples of sizes n and m respectively, then the logarithm of the estimated relative risk  \frac{\hat p}{\hat q} is approximately normally distributed with variance that can be estimated by  \frac{1-\hat p}{\hat p \, n}+\frac{1-\hat q}{\hat q \, m} . This is useful to construct a hypothesis test or to make a confidence interval for the relative risk.
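A confidence interval for the relative risk can be built exactly as described: form the interval on the log scale using the delta-method standard error, then exponentiate. The counts below are hypothetical, for illustration only.

```python
import numpy as np

# Hypothetical observed counts (events / trials in each group)
x, n = 30, 200        # group 1
y, m = 20, 250        # group 2

p_hat, q_hat = x / n, y / m
log_rr = np.log(p_hat / q_hat)              # log of estimated relative risk

# Delta-method variance of the log relative risk
var_log_rr = (1 - p_hat) / (p_hat * n) + (1 - q_hat) / (q_hat * m)
se = np.sqrt(var_log_rr)

# 95% confidence interval, back-transformed to the relative-risk scale
lo, hi = np.exp(log_rr - 1.96 * se), np.exp(log_rr + 1.96 * se)
print(p_hat / q_hat, (lo, hi))
```

Because the interval is symmetric on the log scale, it is asymmetric around the point estimate after exponentiating, which better respects the positivity of the relative risk.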

Note

The delta method is often used in a form that is essentially identical to that above, but without the assumption that Xn or B is asymptotically normal. Often the only context is that the variance is "small". The results then just give approximations to the means and covariances of the transformed quantities. For example, the formulae presented in Klein (1953, p. 258) are:


\begin{align}
\operatorname{Var} \left( h_r \right) = & \sum_i 
  \left( \frac{ \partial h_r }{ \partial B_i } \right)^2
  \operatorname{Var}\left( B_i \right) + \\
 &  \sum_i \sum_{j \neq i} 
  \left( \frac{ \partial h_r }{ \partial B_i } \right)
  \left( \frac{ \partial h_r }{ \partial B_j } \right)
  \operatorname{Cov}\left( B_i, B_j \right) \\
\operatorname{Cov}\left( h_r, h_s \right) = & \sum_i 
  \left( \frac{ \partial h_r }{ \partial B_i } \right)
  \left( \frac{ \partial h_s }{ \partial B_i } \right)
  \operatorname{Var}\left( B_i \right) + \\
 &  \sum_i \sum_{j \neq i} 
  \left( \frac{ \partial h_r }{ \partial B_i } \right)
  \left( \frac{ \partial h_s }{ \partial B_j } \right)
  \operatorname{Cov}\left( B_i, B_j \right)
\end{align}

where hr is the rth element of h(B) and Bi is the ith element of B. The only difference is that Klein stated these as identities, whereas they are actually approximations.
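Klein's formulae are the element-wise form of the matrix expression J Cov(B) J^T, where J is the Jacobian of h. The sketch below (all numerical values are hypothetical, chosen for illustration) evaluates both forms for h(B) = (B1·B2, B1/B2) and confirms they agree.

```python
import numpy as np

# Hypothetical covariance and value of the estimator B
cov_B = np.array([[0.04, 0.01],
                  [0.01, 0.09]])
B = np.array([2.0, 3.0])

# h(B) = (B1 * B2, B1 / B2); rows of J are the gradients of h_r
J = np.array([[B[1], B[0]],
              [1 / B[1], -B[0] / B[1] ** 2]])

# Matrix form: Cov(h) ~= J Cov(B) J^T
cov_h = J @ cov_B @ J.T

# Element-wise double sums, exactly as in the displayed formulae
var_h0 = sum(J[0, i] * J[0, j] * cov_B[i, j]
             for i in range(2) for j in range(2))
cov_h01 = sum(J[0, i] * J[1, j] * cov_B[i, j]
              for i in range(2) for j in range(2))

print(cov_h)
print(var_h0, cov_h01)   # match cov_h[0, 0] and cov_h[0, 1]
```

Splitting each double sum into its i = j terms (the variances) and i ≠ j terms (the covariances) recovers Klein's two-part layout above.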

References

  • Casella, G. and Berger, R. L. (2002), Statistical Inference, 2nd ed.
  • Cramér, H. (1946), Mathematical Methods of Statistics, p. 353.
  • Davison, A. C. (2003), Statistical Models, pp. 33–35.
  • Greene, W. H. (2003), Econometric Analysis, 5th ed., pp. 913f.
  • Klein, L. R. (1953), A Textbook of Econometrics, p. 258.
  • Oehlert, G. W. (1992), A Note on the Delta Method, The American Statistician, Vol. 46, No. 1, pp. 27–29.
