Central limit theorem for directional statistics

In probability theory, the central limit theorem states conditions under which the mean of a sufficiently large number of independent random variables, each with finite mean and variance, will be approximately normally distributed.[1]

Directional statistics is the subdiscipline of statistics that deals with directions (unit vectors in R^n), axes (lines through the origin in R^n) or rotations in R^n. The means and variances of directional quantities are all finite, so the central limit theorem may be applied to the particular case of directional statistics.[2]

This article will deal only with unit vectors in 2-dimensional space (R^2), but the method described can be extended to the general case.

The central limit theorem

A sample of angles \theta_i is measured, and since the angles are indefinite to within an additive multiple of 2π, the unambiguous complex quantity z_i=e^{i\theta_i}=\cos(\theta_i)+i\sin(\theta_i) is used as the random variate. The probability distribution from which the sample is drawn may be characterized by its moments, which may be expressed in Cartesian and polar form:

m_n=E(z^n)= C_n +i S_n = R_n e^{i \theta_n}\,

It follows that:

C_n=E(\cos (n\theta))\,
S_n=E(\sin (n\theta))\,
R_n=|E(z^n)|=\sqrt{C_n^2+S_n^2}\,
\theta_n=\arg(E(z^n))\,
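As a concrete special case (not worked out in the source), for the circular uniform distribution every trigonometric moment of order n ≥ 1 vanishes:

C_n=E(\cos(n\theta))=0,\quad S_n=E(\sin(n\theta))=0

so that R_n=0 and \theta_n is undefined.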

Sample moments for N trials are:

\overline{m_n}=\frac{1}{N}\sum_{i=1}^N z_i^n =\overline{C_n} +i \overline{S_n} = \overline{R_n} e^{i \overline{\theta_n}}

where

\overline{C_n}=\frac{1}{N}\sum_{i=1}^N\cos(n\theta_i)
\overline{S_n}=\frac{1}{N}\sum_{i=1}^N\sin(n\theta_i)
\overline{R_n}=|\overline{m_n}|=\sqrt{\overline{C_n}^2+\overline{S_n}^2}
\overline{\theta_n}=\arg(\overline{m_n})
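
As a minimal numerical sketch (not part of the original article; the von Mises sample is only an example choice, as any circular distribution works the same way), the sample moments can be computed directly from the definitions above:

    import numpy as np

    rng = np.random.default_rng(0)
    # Example data: 1000 angles from a von Mises distribution
    # (an assumption for illustration).
    theta = rng.vonmises(mu=0.5, kappa=2.0, size=1000)

    def sample_moment(theta, n):
        """Return the n-th sample moment (1/N) sum_i z_i^n as a complex number."""
        return np.exp(1j * n * theta).mean()

    m1 = sample_moment(theta, 1)
    C1_bar, S1_bar = m1.real, m1.imag           # Cartesian components
    R1_bar, theta1_bar = abs(m1), np.angle(m1)  # polar components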

The vector [\overline{C_1},\overline{S_1}] may be used as a representation of the sample mean (\overline{m_1}) and may be taken as a 2-dimensional random variate.[2] The bivariate central limit theorem states that the joint probability distribution for \overline{C_1} and \overline{S_1} in the limit of a large number of samples is given by:

[\overline{C_1},\overline{S_1}] \xrightarrow{d} \mathcal{N}([C_1,S_1],\Sigma/N)

where \xrightarrow{d} denotes convergence in distribution, \mathcal{N}() is the bivariate normal distribution and Σ is the covariance matrix of the circular distribution:


\Sigma=\begin{bmatrix} \sigma_{CC} & \sigma_{CS} \\ \sigma_{SC} & \sigma_{SS} \end{bmatrix}

where

\sigma_{CC}=E(\cos^2\theta)-E(\cos\theta)^2\,
\sigma_{CS}=\sigma_{SC}=E(\cos\theta\sin\theta)-E(\cos\theta)E(\sin\theta)\,
\sigma_{SS}=E(\sin^2\theta)-E(\sin\theta)^2\,

Note that the bivariate normal distribution is defined over the entire plane, while the mean is confined to the closed unit disk (on or inside the unit circle), since it is an average of points that lie on the unit circle. This means that the integral of the limiting (bivariate normal) distribution over the unit disk will not be exactly unity, but will approach unity as N approaches infinity.
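
The convergence can be checked by simulation. The following sketch (an illustration under assumed parameters, not taken from the source) draws many independent samples of size N, forms the mean vector for each, and compares N times its empirical covariance with Σ:

    import numpy as np

    rng = np.random.default_rng(1)
    N, reps = 200, 5000                     # sample size and number of replicates
    # Example circular distribution (an assumption for illustration)
    theta = rng.vonmises(mu=0.5, kappa=2.0, size=(reps, N))

    # Mean vector [C1_bar, S1_bar] for each replicate
    C1_bar = np.cos(theta).mean(axis=1)
    S1_bar = np.sin(theta).mean(axis=1)

    # Sigma estimated from one large sample of the underlying distribution
    big = rng.vonmises(mu=0.5, kappa=2.0, size=10**6)
    Sigma = np.cov(np.cos(big), np.sin(big))

    # N * covariance of the mean vector should be close to Sigma
    print(np.round(N * np.cov(C1_bar, S1_bar), 4))
    print(np.round(Sigma, 4))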

It is desired to state the limiting bivariate distribution in terms of the moments of the distribution.

Covariance matrix in terms of moments

Using multiple angle trigonometric identities[2]

C_2=E(\cos(2\theta))=E(2\cos^2\theta-1)=E(1-2\sin^2\theta)\,
S_2=E(\sin(2\theta))=E(2\cos\theta\sin\theta)\,

It follows that:

\sigma_{CC}=E(\cos^2\theta)-E(\cos\theta)^2 =\frac{1}{2}\left(1 + C_2 - 2C_1^2\right)
\sigma_{CS}=E(\cos\theta\sin\theta)-E(\cos\theta)E(\sin\theta)=\frac{1}{2}\left(S_2 - 2 C_1 S_1   \right)
\sigma_{SS}=E(\sin^2\theta)-E(\sin\theta)^2 =\frac{1}{2}\left(1   - C_2 - 2S_1^2\right)

The covariance matrix is now expressed in terms of the moments of the circular distribution.
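
Because the double-angle identities hold for every individual sample value, the two forms of the covariances agree exactly (up to floating-point error). The following sketch verifies this numerically (an illustration; the von Mises sample is an assumed example):

    import numpy as np

    rng = np.random.default_rng(2)
    theta = rng.vonmises(mu=0.5, kappa=2.0, size=10**6)

    # Sample trigonometric moments
    C1, S1 = np.cos(theta).mean(), np.sin(theta).mean()
    C2, S2 = np.cos(2 * theta).mean(), np.sin(2 * theta).mean()

    # Covariances from the moment formulas
    s_CC = 0.5 * (1 + C2 - 2 * C1**2)
    s_CS = 0.5 * (S2 - 2 * C1 * S1)
    s_SS = 0.5 * (1 - C2 - 2 * S1**2)

    # Covariances directly from the definitions
    d_CC = (np.cos(theta) ** 2).mean() - C1**2
    d_CS = (np.cos(theta) * np.sin(theta)).mean() - C1 * S1
    d_SS = (np.sin(theta) ** 2).mean() - S1**2

    print(np.allclose([s_CC, s_CS, s_SS], [d_CC, d_CS, d_SS]))  # True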

The central limit theorem may also be expressed in terms of the polar components of the mean. If P(\overline{C_1},\overline{S_1})\,d\overline{C_1}\,d\overline{S_1} is the probability of finding the mean in the area element d\overline{C_1}\,d\overline{S_1}, then that probability may also be written P(\overline{R_1}\cos(\overline{\theta_1}),\overline{R_1}\sin(\overline{\theta_1}))\,\overline{R_1}\,d\overline{R_1}\,d\overline{\theta_1}.
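
The factor of \overline{R_1} is the Jacobian of the polar change of variables (a standard step, spelled out here for completeness):

d\overline{C_1}\,d\overline{S_1}=\left|\det\begin{bmatrix}\cos\overline{\theta_1} & -\overline{R_1}\sin\overline{\theta_1}\\ \sin\overline{\theta_1} & \overline{R_1}\cos\overline{\theta_1}\end{bmatrix}\right|\,d\overline{R_1}\,d\overline{\theta_1}=\overline{R_1}\,d\overline{R_1}\,d\overline{\theta_1}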

References

  1. Rice (1995).[full citation needed]
  2. Jammalamadaka, S. Rao; SenGupta, A. (2001). Topics in Circular Statistics. New Jersey: World Scientific. ISBN 9810237782. http://books.google.com/books?id=sKqWMGqQXQkC. Retrieved 2011-05-15.
