 Levenberg–Marquardt algorithm

In mathematics and computing, the Levenberg–Marquardt algorithm (LMA)^{[1]} provides a numerical solution to the problem of minimizing a function, generally nonlinear, over a space of parameters of the function. These minimization problems arise especially in least squares curve fitting and nonlinear programming.
The LMA interpolates between the Gauss–Newton algorithm (GNA) and the method of gradient descent. The LMA is more robust than the GNA, which means that in many cases it finds a solution even if it starts very far off the final minimum. For wellbehaved functions and reasonable starting parameters, the LMA tends to be a bit slower than the GNA. LMA can also be viewed as Gauss–Newton using a trust region approach.
The LMA is a very popular curvefitting algorithm used in many software applications for solving generic curvefitting problems. However, the LMA finds only a local minimum, not a global minimum.
Contents
The problem
The primary application of the Levenberg–Marquardt algorithm is in the least squares curve fitting problem: given a set of m empirical datum pairs of independent and dependent variables, (x_{i}, y_{i}), optimize the parameters β of the model curve f(x,β) so that the sum of the squares of the deviations
becomes minimal.
The solution
Like other numeric minimization algorithms, the Levenberg–Marquardt algorithm is an iterative procedure. To start a minimization, the user has to provide an initial guess for the parameter vector, β. In cases with only one minimum, an uninformed standard guess like β^{T}=(1,1,...,1) will work fine; in cases with multiple minima, the algorithm converges only if the initial guess is already somewhat close to the final solution.
In each iteration step, the parameter vector, β, is replaced by a new estimate, β + δ. To determine δ, the functions are approximated by their linearizations
where
is the gradient (rowvector in this case) of f with respect to β.
At its minimum, the sum of squares, S(β), the gradient of S with respect to δ will be zero. The above firstorder approximation of gives
 .
Or in vector notation,
 .
Taking the derivative with respect to δ and setting the result to zero gives:
where is the Jacobian matrix whose i^{th} row equals J_{i}, and where and are vectors with i^{th} component and y_{i}, respectively. This is a set of linear equations which can be solved for δ.
Levenberg's contribution is to replace this equation by a "damped version",
where I is the identity matrix, giving as the increment, δ, to the estimated parameter vector, β.
The (nonnegative) damping factor, λ, is adjusted at each iteration. If reduction of S is rapid, a smaller value can be used, bringing the algorithm closer to the Gauss–Newton algorithm, whereas if an iteration gives insufficient reduction in the residual, λ can be increased, giving a step closer to the gradient descent direction. Note that the gradient of S with respect to β equals . Therefore, for large values of λ, the step will be taken approximately in the direction of the gradient. If either the length of the calculated step, δ, or the reduction of sum of squares from the latest parameter vector, β + δ, fall below predefined limits, iteration stops and the last parameter vector, β, is considered to be the solution.
Levenberg's algorithm has the disadvantage that if the value of damping factor, λ, is large, inverting J^{T}J + λI is not used at all. Marquardt provided the insight that we can scale each component of the gradient according to the curvature so that there is larger movement along the directions where the gradient is smaller. This avoids slow convergence in the direction of small gradient. Therefore, Marquardt replaced the identity matrix, I, with the diagonal matrix consisting of the diagonal elements of J^{T}J, resulting in the Levenberg–Marquardt algorithm:
 .
A similar damping factor appears in Tikhonov regularization, which is used to solve linear illposed problems, as well as in ridge regression, an estimation technique in statistics.
Choice of damping parameter
Various moreorless heuristic arguments have been put forward for the best choice for the damping parameter λ. Theoretical arguments exist showing why some of these choices guaranteed local convergence of the algorithm; however these choices can make the global convergence of the algorithm suffer from the undesirable properties of steepestdescent, in particular very slow convergence close to the optimum.
The absolute values of any choice depends on how wellscaled the initial problem is. Marquardt recommended starting with a value λ_{0} and a factor ν>1. Initially setting λ=λ_{0} and computing the residual sum of squares S(β) after one step from the starting point with the damping factor of λ=λ_{0} and secondly with λ_{0}/ν. If both of these are worse than the initial point then the damping is increased by successive multiplication by ν until a better point is found with a new damping factor of λ_{0}ν^{k} for some k.
If use of the damping factor λ/ν results in a reduction in squared residual then this is taken as the new value of λ (and the new optimum location is taken as that obtained with this damping factor) and the process continues; if using λ/ν resulted in a worse residual, but using λ resulted in a better residual then λ is left unchanged and the new optimum is taken as the value obtained with λ as damping factor.
Example
In this example we try to fit the function y = acos(bX) + bsin(aX) using the Levenberg–Marquardt algorithm implemented in GNU Octave as the leasqr function. The 3 graphs Fig 1,2,3 show progressively better fitting for the parameters a=100, b=102 used in the initial curve. Only when the parameters in Fig 3 are chosen closest to the original, are the curves fitting exactly. This equation is an example of very sensitive initial conditions for the Levenberg–Marquardt algorithm. One reason for this sensitivity is the existence of multiple minima — the function cos(βx) has minima at parameter value and
Notes
 ^ The algorithm was first published by Kenneth Levenberg, while working at the Frankford Army Arsenal. It was rediscovered by Donald Marquardt who worked as a statistician at DuPont and independently by Girard, Wynn and Morrison.
See also
References
 Kenneth Levenberg (1944). "A Method for the Solution of Certain NonLinear Problems in Least Squares". The Quarterly of Applied Mathematics 2: 164–168.
 A. Girard (1958). Rev. Opt 37: 225, 397.
 C.G. Wynne (1959). "Lens Designing by Electronic Digital Computer: I". Proc. Phys. Soc. London 73 (5): 777. doi:10.1088/03701328/73/5/310.
 Jorje J. Moré and Daniel C. Sorensen (1983). "Computing a TrustRegion Step". SIAM J. Sci. Stat. Comput. (4): 553–572.
 D.D. Morrison (1960). Jet Propulsion Laboratory Seminar proceedings.
 Donald Marquardt (1963). "An Algorithm for LeastSquares Estimation of Nonlinear Parameters". SIAM Journal on Applied Mathematics 11 (2): 431–441. doi:10.1137/0111030.
 Philip E. Gill and Walter Murray (1978). "Algorithms for the solution of the nonlinear leastsquares problem". SIAM Journal on Numerical Analysis 15 (5): 977–992. doi:10.1137/0715063.
 Nocedal, Jorge; Wright, Stephen J. (2006). Numerical Optimization, 2nd Edition. Springer. ISBN 0387303030.
External links
Descriptions
 Detailed description of the algorithm can be found in Numerical Recipes in C, Chapter 15.5: Nonlinear models
 C. T. Kelley, Iterative Methods for Optimization, SIAM Frontiers in Applied Mathematics, no 18, 1999, ISBN 0898714338. Online copy
 History of the algorithm in SIAM news
 A tutorial by Ananth Ranganathan
 Methods for NonLinear Least Squares Problems by K. Madsen, H.B. Nielsen, O. Tingleff is a tutorial discussing nonlinear leastsquares in general and the LevenbergMarquardt method in particular
 T. Strutz: Data Fitting and Uncertainty (A practical introduction to weighted least squares and beyond). Vieweg+Teubner, ISBN 9783834810229.
Implementations
 LevenbergMarquardt is a builtin algorithm with Mathematica
 LevenbergMarquardt is a builtin algorithm with Matlab
 LevenbergMarquardt is a builtin algorithm with Origin
 The oldest implementation still in use is lmdif, from MINPACK, in Fortran, in the public domain. See also:
 lmfit, a translation of lmdif into C/C++ with an easytouse wrapper for curve fitting, public domain.
 The GNU Scientific Library library has a C interface to MINPACK.
 C/C++ Minpack includes the Levenberg–Marquardt algorithm.
 Several highlevel languages and mathematical packages have wrappers for the MINPACK routines, among them:
 Python library scipy, module
scipy.optimize.leastsq
,  IDL, addon MPFIT.
 R (programming language) has the minpack.lm package.
 Python library scipy, module
 levmar is an implementation in C/C++ with support for constraints, distributed under the GNU General Public License.
 levmar includes a MEX file interface for MATLAB
 Perl (PDL), python and Haskell interfaces to levmar are available: see PDL::Fit::Levmar, PyLevmar and HackageDB levmar.
 sparseLM is a C implementation aimed at minimizing functions with large, arbitrarily sparse Jacobians. Includes a MATLAB MEX interface.
 InMin library contains a C++ implementation of the algorithm based on the eigen C++ linear algebra library. It has a pure Clanguage API as well as a Python binding
 ALGLIB has implementations of improved LMA in C# / C++ / Delphi / Visual Basic. Improved algorithm takes less time to converge and can use either Jacobian or exact Hessian.
 NMath has an implementation for the .NET Framework.
 gnuplot uses its own implementation gnuplot.info.
 Java programming language implementations: 1) Javanumerics, 2) LMApackage (a small, user friendly and well documented implementation with examples and support), 3) Apache Commons Math
 OOoConv implements the LM algorithm as an OpenOffice.org Calc spreadsheet.
 SAS, there are multiple ways to access SAS's implementation of the LevenbergMarquardt algorithm: it can be accessed via NLPLM Call in PROC IML and it can also be accessed through the LSQ statement in PROC NLP, and the METHOD=MARQUARDT option in PROC NLIN.
Optimization: Algorithms, methods, and heuristics Unconstrained nonlinear: Methods calling ... ... and gradients... and HessiansConstrained nonlinear GeneralDifferentiableAugmented Lagrangian methods · Sequential quadratic programming · Successive linear programmingConvex minimization GeneralBasisexchangeCombinatorial ParadigmsApproximation algorithm · Dynamic programming · Greedy algorithm · Integer programming (Branch & bound or cut)Graph algorithmsMetaheuristics Categories: Statistical algorithms
 Optimization methods
 Least squares
Wikimedia Foundation. 2010.
Look at other dictionaries:
LevenbergMarquardtAlgorithmus — Der Levenberg Marquardt Algorithmus, benannt nach Kenneth Levenberg und Donald Marquardt, ist ein numerischer Optimierungsalgorithmus zur Lösung nichtlinearer Ausgleichs Probleme mit Hilfe der Methode der kleinsten Quadrate. Das Verfahren… … Deutsch Wikipedia
LevenbergMarquardt — Algorithme de Levenberg Marquardt L’algorithme de Levenberg Marquardt, ou algorithme LM, permet d obtenir une solution numérique au problème de minimisation d une fonction, souvent non linéaire et dépendant de plusieurs variables. L algorithme… … Wikipédia en Français
Algorithme de LevenbergMarquardt — L’algorithme de Levenberg Marquardt, ou algorithme LM, permet d obtenir une solution numérique au problème de minimisation d une fonction, souvent non linéaire et dépendant de plusieurs variables. L algorithme interpole l algorithme de Gauss… … Wikipédia en Français
Algorithme De LevenbergMarquardt — L’algorithme de Levenberg Marquardt, ou algorithme LM, permet d obtenir une solution numérique au problème de minimisation d une fonction, souvent non linéaire et dépendant de plusieurs variables. L algorithme interpole l algorithme de Gauss… … Wikipédia en Français
Algorithme de levenbergmarquardt — L’algorithme de Levenberg Marquardt, ou algorithme LM, permet d obtenir une solution numérique au problème de minimisation d une fonction, souvent non linéaire et dépendant de plusieurs variables. L algorithme interpole l algorithme de Gauss… … Wikipédia en Français
MarquardtLevenberg — Algorithme de Levenberg Marquardt L’algorithme de Levenberg Marquardt, ou algorithme LM, permet d obtenir une solution numérique au problème de minimisation d une fonction, souvent non linéaire et dépendant de plusieurs variables. L algorithme… … Wikipédia en Français
Gauss–Newton algorithm — The Gauss–Newton algorithm is a method used to solve non linear least squares problems. It can be seen as a modification of Newton s method for finding a minimum of a function. Unlike Newton s method, the Gauss–Newton algorithm can only be used… … Wikipedia
Donald Marquardt — Donald W. Marquardt (March 13, 1929 – July 5, 1997[1]) was an American statistician, the rediscoverer of the Levenberg–Marquardt nonlinear least squares fitting algorithm.[2] Marquardt joined DuPont in 1953 and worked there for 39 years. He also… … Wikipedia
Donald Marquardt — Donald W. Marquardt (13 mars 1929 5 juillet 1997[1]) est un statisticien américain qui a publié l algorithme d ajustement aux moindres carrés d équations non linéaires, développé par Kenneth Levenberg et connu sous le nom d algorithme de… … Wikipédia en Français
Crisscross algorithm — This article is about an algorithm for mathematical optimization. For the naming of chemicals, see crisscross method. The criss cross algorithm visits all 8 corners of the Klee–Minty cube in the worst case. It visits 3 additional… … Wikipedia