- Ensemble Kalman filter
The ensemble Kalman filter (EnKF) is a
recursive filtersuitable for problems with a large number of variables, such as discretizations of partial differential equations in geophysical models. The EnKF originated as a version of the Kalman filterfor large problems (essentially, the covariance matrixis replaced by the sample covariance), and it is now an important data assimilationcomponent of ensemble forecasting. EnKF is related to the particle filter(in this context, a particle is the same thing as ensemble member) but the EnKF makes the assumption that all probability distributions involved are Gaussian; when it is applicable, it is much more efficient than the particle filter. This article briefly describes the derivation and practical implementation of the basic version of EnKF, and reviews several extensions.
The Ensemble Kalman Filter (EnKF) is a Monte-Carlo implementation of the Bayesian update problem: Given a
probability density function(pdf) of the state of the modeled system (the "prior", called often the forecast in geosciences) and the data likelihood, the Bayes theoremis used to obtain the pdf after the data likelihood has been taken into account (the "posterior", often called the analysis). This is called a Bayesian update. The Bayesian update is combined with advancing the model in time, incorporating new data from time to time. The original Kalman FilterR. E. Kalman, "A new approach to linear filtering and prediction problems", Transactions of the ASME -- Journal of Basic Engineering, Series D, 82 (1960), pp. 35--45.
] assumes that all pdfs are Gaussian (the Gaussian assumption) and provides algebraic formulas for the change of the
meanand the covariance matrixby the Bayesian update, as well as a formula for advancing the covariance matrix in time provided the system is linear. However, maintaining the covariance matrix is not feasible computationally for high-dimensional systems. For this reason, EnKFs were developed. G. Evensen, "Sequential data assimilation with nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics", Journal of Geophysical Research, 99 (C5) (1994), pp. 143--162.
] P. Houtekamer and H. L. Mitchell, "Data assimilation using an ensemble Kalman filter technique", Monthly Weather Review, 126 (1998), pp. 796--811.
] EnKFs represent the distribution of the system state using a collection of state vectors, called an ensemble, and replace the covariance matrix by the
sample covariancecomputed from the ensemble. The ensemble is operated with as if it were a random sample, but the ensemble members are really not independent - the EnKF ties them together. One advantage of EnKFs is that advancing the pdf in time is achieved by simply advancing each member of the ensemble. For a survey of EnKF and related data assimilation techniques, see. G. Evensen, "Data assimilation : The ensemble Kalman filter, Springer, Berlin, 2007.
] A version of this article is also available as the technical report. J. Mandel, "A brief tutorial on the ensemble Kalman filter", CCM Report 242, University of Colorado at Denver and Health Sciences Center, February 2007. [http://www.math.cudenver.edu/ccm/reports/rep242.pdf report]
A derivation of the EnKF
The Kalman Filter
Let us review first the
Kalman filter. Let denote the -dimensional state vectorof a model, and assume that it has Gaussian probability distribution with mean and covariance , i.e., its pdf is
Here and below, means proportional; a pdf is always scaled so that its integral over the whole space is one. This , called the "prior", was evolved in time by running the model and now is to be updated to account for new data. It is natural to assume that the error distribution of the data is known; data have to come with an error estimate, otherwise they are meaningless. Here, the data is assumed to have Gaussian pdf with covariance and mean , where is the so-called the observation matrix. The covariance matrix describes the estimate of the error of the data; if the random errors in the entries of the data vector are independent, is diagonal and its diagonal entries are the squares of the
standard deviation(“error size”) of the error of the corresponding entries of the data vector . The value is what the value of the data would be for the state in the absence of data errors. Then the probability density of the data conditional of the system state , called the data likelihood, is
The data is fixed once it is received, so denote the posterior state by instead of and the posterior pdf by . It can be shown by algebraic manipulations B. D. O. Anderson and J. B. Moore, "Optimal filtering", Prentice-Hall, Englewood Cliffs, N.J., 1979.
] that the posterior pdf is also Gaussian,
with the posterior mean and covariance given by the Kalman update formulas
is the so-called Kalman gain matrix.
The Ensemble Kalman Filter
The EnKF is a Monte Carlo approximation of the Kalman filter, which avoids evolving the covariance matrix of the pdf of the state vector . Instead, the pdf is represented by an ensemble
is an matrix whose columns are the ensemble members, and it is called the "prior ensemble". Ideally, ensemble members would form a sample from the prior distribution. However, the ensemble members are not in general independent except in the initial ensemble, since every EnKF step ties them together. They are deemed to be approximately independent, and all calculations proceed as if they actually were independent.
Replicate the data into an matrix
so that each column consists of the data vector plus a random vector from the -dimensional normal distribution . If, in addition, the columns of are a sample from the
prior probabilitydistribution, then the columns of
form a sample from the
posterior probabilitydistribution. The EnKF is now obtained C. J. Johns and J. Mandel, "A two-stage ensemble Kalman filter for smooth data assimilation". Environmental and Ecological Statistics, in print. Special issue, Conference on New Developments of Statistical Analysis in Wildlife, Fisheries, and Ecological Research, Oct 13-16, 2004, Columbia, MI. CCM Report 221, University of Colorado at Denver and Health Sciences Center, 2005. [http://www.math.cudenver.edu/ccm/reports/rep221.pdf report]
] simply by replacing the state covariance in Kalman gain matrix by the sample covariance computed from the ensemble members (called the "ensemble covariance").
Here we follow. G. Burgers, P. J. van Leeuwen, and G. Evensen, "Analysis scheme in the ensemble Kalman filter", Monthly Weather Review, 126 (1998), pp. 1719--1724.
] G. Evensen, "The ensemble Kalman filter: Theoretical formulation and practical implementation", Ocean Dynamics, 53 (2003), pp. 343--367.
] Suppose the ensemble matrix and the data matrix are as above. The ensemble mean and the covariance are
and denotes the matrix of all ones of the indicated size.
The posterior ensemble is then given by
where the perturbed data matrix is as above. Since can be written as
one can see that "the posterior ensemble consists of linear combinations of members of the prior ensemble". (This is, however, no longer the case when the covariance matrix of the ensemble is artificially modified, as in localized ensemble Kalman filters.)
Note that since is a covariance matrix, it is always positive semidefinite and usually positive definite, so the inverse above exists and the formula can be implemented by the
Choleski decomposition. J. Mandel, "Efficient implementation of the ensemble Kalman filter". CCM Report 231, University of Colorado at Denver and Health Sciences Center. [http://www.math.cudenver.edu/ccm/reports/rep231.pdf link] , June 2006.
Since these formulas are matrix operations with dominant Level 3 operations, G. H. Golub and C. F. V. Loan, "Matrix Computations", Johns Hopkins Univ. Press, 1989. Second Edition.
] they are suitable for efficient implementation using software packages such as
LAPACK(on serial and shared memorycomputers) and ScaLAPACK(on distributed memorycomputers). Instead of computing the inverse of a matrix and multiplying by it, it is much better (several times cheaper and also more accurate) to compute the Choleski decompositionof the matrix and treat the multiplication by the inverse as solution of a linear system with many simultaneous right-hand sides.
Observation matrix-free implementation
It is usually inconvenient to construct and operate with the matrix explicitly; instead, a function of the form
is more natural to compute. The function is called the "
observation function" or, in the inverse problems context, the "forward operator". The value of is what the value of the data would be for the state assuming the measurement is exact. Then the posterior ensemble can be rewritten as
Consequently, the ensemble update can be computed by evaluating the observation function on each ensemble member once and the matrix does not need to be known explicitly. This formula holds also for an observation function with a fixed offset , which also does not need to be known explicitly. The above formula has been commonly used for a nonlinear observation function , such as the position of a
hurricane vortex. Y. Chen and C. Snyder, "Assimilating vortex position with an ensemble Kalman filter". Monthly Weather Review, to appear, 2006. [http://www.mmm.ucar.edu/people/snyder/papers/ChenSnyder2006_draft.pdf preprint] .
] In that case, the observation function is essentially approximated by a linear function from its values at ensemble members.
Implementation for a large number of data points
For a large number of data points, the multiplication by becomes a bottleneck. The following alternative formula is advantageous when the number of data points is large (such as when assimilating gridded or pixel data) and the data error covariance matrix is diagonal (which is the case when the data errors are uncorrelated), or cheap to decompose (such as banded due to limited covariance distance). Using the
Sherman–Morrison–Woodbury formulaW. W. Hager, "Updating the inverse of a matrix", SIAM Rev., 31 (1989), pp. 221--239.
The EnKF version described here involves randomization of data. For filters without randomization of data, see. J. L. Anderson, "An ensemble adjustment Kalman filter for data assimilation", Monthly Weather Review, 129 (1999), pp. 2884--2903.
] G. Evensen, "Sampling strategies and square root analysis schemes for the EnKF", Ocean Dynamics, 54 (2004), pp. 539--560.
] M. K. Tippett, J. L. Anderson, C. H. Bishop, T. M. Hamill, and J. S. Whitaker, "Ensemble square root filters", Monthly Weather Review, 131 (2003), pp. 1485--1490.
Since the ensemble covariance is
rank deficient(there are many more state variables, typically millions, than the ensemble members, typically less than a hundred), it has large terms for pairs of points that are spatially distant. Since in reality the values of physical fields at distant locations are not that much correlated, the covariance matrix is tapered off artificially based on the distance, which gives rise to localized EnKF algorithms. J. L. Anderson, "A local least squares framework for ensemble filtering", Monthly Weather Review, 131 (2003), pp. 634--642.] E. Ott, B. R. Hunt, I. Szunyogh, A. V. Zimin, E. J. Kostelich, M. Corazza, E. Kalnay, D. Patil, and J. A. Yorke, "A local ensemble Kalman filter for atmospheric data assimilation", Tellus A, 56 (2004), pp. 415--428.] These methods modify the covariance matrix used in the computations and, consequently, the posterior ensemble is no longer made only of linear combinations of the prior ensemble.
For problems with
coherent features, such as firelines, squall lines, and rain fronts, there is a need to adjust the simulation state by distorting the state in space as well as by an additive correction to the state. The morphing EnKF J. D. Beezley and J. Mandel, "Morphing ensemble Kalman filters". Tellus A, submitted. CCM Report 240, University of Colorado at Denver and Health Sciences Center, February 2007, [http://www-math.cudenver.edu/ccm/reports/rep240.pdf report] .
] J. Mandel and J. D. Beezley, "Predictor-corrector and morphing ensemble filters for the assimilation of sparse data into high dimensional nonlinear systems". CCM Report 239, University of Colorado at Denver and Health Sciences Center. [http://www.math.cudenver.edu/ccm/reports/rep239.pdf report] , November 2006. 11th Symposium on Integrated Observing and Assimilation Systems for the Atmosphere, Oceans, and Land Surface (IOAS-AOLS), CD-ROM, Paper 4.12, 87th American Meteorological Society Annual Meeting, San Antonio, TX, January 2007, [http://www.ametsoc.org link] .
] employs intermediate states, obtained by techniques borrowed from
image registrationand morphing, instead of linear combinations of states.
EnKFs rely on the Gaussian assumption, though they are of course used in practice for nonlinear problems, where the Gaussian assumption is not satisfied. Related filters attempting to relax the Gaussian assumption in EnKF while preserving its advantages include filters that fit the state pdf with multiple Gaussian kernels, J. L. Anderson and S. L. Anderson, "A Monte Carlo implementation of the nonlinear filtering problem to produce ensemble assimilations and forecasts", Monthly Weather Review, 127 (1999), pp. 2741--2758.
] filters that approximate the state pdf by
Gaussian mixtures, T. Bengtsson, C. Snyder, and D. Nychka, "Toward a nonlinear ensemble filter for high dimensional systems", Journal of Geophysical Research - Atmospheres, 108(D24) (2003), pp. STS 2--1--10. [http://www.image.ucar.edu/pub/nychka/manuscripts/bengtsson.pdf preprint] .
] a variant of the
particle filterwith computation of particle weights by density estimation, and a variant of the particle filter with thick tailed data pdf to alleviate particle filter degeneracy. P. van Leeuwen, "A variance-minimizing filter for large-scale applications", Monthly Weather Review, 131 (2003), pp. 2071--2084.]
Numerical weather prediction#Ensembles
Recursive Bayesian estimation
* [http://www.cmascenter.org/conference/2006/abstracts/zubrow_session7.pdf Ensemble Adjusted Kalman Filter applied to CO measurements] (EAKF):"(EAKF-CMAQ: DEVELOPMENT AND INITIAL EVALUATION OF AN ENSEMBLE ADJUSTMENT KALMAN FILTER BASED DATA ASSIMILATION FOR CO)"
* [http://enkf.nersc.no EnKF webpage] (EnKF)
* [http://topaz.nersc.no TOPAZ, real-time forecasting of the North Atlantic ocean and Arctic sea-ice with the EnKF] (TOPAZ)
Wikimedia Foundation. 2010.
Look at other dictionaries:
Kalman filter — Roles of the variables in the Kalman filter. (Larger image here) In statistics, the Kalman filter is a mathematical method named after Rudolf E. Kálmán. Its purpose is to use measurements observed over time, containing noise (random variations)… … Wikipedia
Extended Kalman filter — In estimation theory, the extended Kalman filter (EKF) is the nonlinear version of the Kalman filter which linearizes about the current mean and covariance. The EKF is often considered the de facto standard in the theory of nonlinear state… … Wikipedia
Ensemble forecasting — is a numerical prediction method that is used to attempt to generate a representative sample of the possible future states of a dynamical system. Multiple numerical predictions are conducted using slightly different initial conditions that are… … Wikipedia
Particle filter — Particle filters, also known as sequential Monte Carlo methods (SMC), are sophisticated model estimation techniques based on simulation. They are usually used to estimate Bayesian models and are the sequential ( on line ) analogue of Markov chain … Wikipedia
Data assimilation — Applications of data assimilation arise in many fields of geosciences, perhaps most importantly in weather forecasting and hydrology. Data assimilation proceeds by analysis cycles. In each analysis cycle, observations of the current (and possibly … Wikipedia
Фильтр Калмана — Фильтр Калмана эффективный рекурсивный фильтр, оценивающий вектор состояния динамической системы, используя ряд неполных и зашумленных измерений. Назван в честь Рудольфа Калмана. Фильтр Калмана широко используется в инженерных и… … Википедия
List of statistics topics — Please add any Wikipedia articles related to statistics that are not already on this list.The Related changes link in the margin of this page (below search) leads to a list of the most recent changes to the articles listed below. To see the most… … Wikipedia
Estimation theory — is a branch of statistics and signal processing that deals with estimating the values of parameters based on measured/empirical data. The parameters describe an underlying physical setting in such a way that the value of the parameters affects… … Wikipedia
List of mathematics articles (E) — NOTOC E E₇ E (mathematical constant) E function E₈ lattice E₈ manifold E∞ operad E7½ E8 investigation tool Earley parser Early stopping Earnshaw s theorem Earth mover s distance East Journal on Approximations Eastern Arabic numerals Easton s… … Wikipedia
List of numerical analysis topics — This is a list of numerical analysis topics, by Wikipedia page. Contents 1 General 2 Error 3 Elementary and special functions 4 Numerical linear algebra … Wikipedia