Root mean square deviation (bioinformatics)

Root mean square deviation (bioinformatics)

The root mean square deviation (RMSD) is the measure of the average distance between the backbones of superimposed proteins. In the study of globular protein conformations, one customarily measures the similarity in three-dimensional structure by the RMSD of the Cα atomic coordinates after optimal rigid body superposition.

When a dynamical system fluctuates about some well-defined average position the RMSD from the average over time can be referred to as the RMSF or root mean square fluctuation. The size of this fluctuation can be measured, eg using Mössbauer spectroscopy or nuclear magnetic resonance, and can provide important physical information. The Lindemann index is a method of placing the RMSF in the context of the parameters of the system.

A widely used way to compare the structures of biomolecules or solid bodies is to translate and rotate one structure with respect to the other to minimize the RMSD. Coutsias, "et al." presented a simple derivation, based on quaternions, for the optimal solid body transformation (rotation-translation) that minimizes the RMSD between two sets of vectors.cite journal | author = Coutsias EA, Seok C, Dill KA | title = Using quaternions to calculate RMSD | journal = J Comput Chem | volume = 25 | issue = 15 | pages = 1849–1857 | year = 2004 | pmid = 15376254 | doi = 10.1002/jcc.20110] They proved that the quaternion method is equivalent to the well-known formula due to Kabsch.cite journal | author = Kabsch W | title = A solution for the best rotation to relate two sets of vectors | journal = Acta Crystallographica | volume = 32 | pages = 922–923 | year = 1976]

The equation


where δ is the distance between N pairs of equivalent atoms (usually "Cα" and sometimes "C","N","O","Cβ").

Normally a rigid superposition which minimizes the RMSD is performed, and this minimum is returned. Given two sets of n points mathbf{v} and mathbf{w}, the RMSD is defined as follows:

An RMSD value is expressed in length units. The most commonly used unit in structural biology is the Ångström (Å) which is equal to 10–10m.


Typically RMSD is used to make a quantitative comparison between the structure of a partially folded protein and the structure of the native state. For example, the CASP protein structure prediction competition uses RMSD as one of its assessments of how well a submitted structure matches the native state.

Also some scientists who study protein folding simulations use RMSD as a reaction coordinate to quantify where the protein is between the folded state and the unfolded state.

ee also

*Root mean square deviation
*Root mean square fluctuation
*Quaternion—used to optimise RMSD calculations
*Kabsch algorithm—an algorithm used to minimize the RMSD by first finding the best rotationcite journal | author = Kabsch W | title = A solution for the best rotation to relate two sets of vectors | journal = Acta Crystallographica | volume = 32 | pages = 922–923 | year = 1976]
* [ Secondary Structure Matching (SSM)] — a tool for protein structure comparison. Uses RMSD.
* [ SuperPose] — a protein superposition server. Uses RMSD.
* [ superpose] — structural alignment based on secondary structure matching. By the CCP4 project. Uses RMSD.


Further reading
* Armougom F, Moretti S, Keduas V, Notredame C (2006). "The iRMSD: a local measure of sequence alignment accuracy using structural information." "Bioinformatics, 22(14):e35-9".
* Damm KL, Carlson HA (2006). "Gaussian-weighted RMSD superposition of proteins: a structural comparison for flexible proteins and predicted protein structures." "Biophys J, 90(12):4558-73".
* Kneller GR (2005). "Comment on ``Using quaternions to calculate RMSD" [J. Comp. Chem. 25, 1849 (2004)] "." "J Comput Chem, 26(15):1660-2".
* Theobald DL (2005). "Rapid calculation of RMSDs using a quaternion-based characteristic polynomial." "Acta Crystallogr A", 61(Pt 4):478-80.
* Maiorov VN, Crippen GM (1994). "Significance of root-mean-square deviation in comparing three-dimensional structures of globular proteins." "J Mol Biol, 235(2):625-34".

External links

* [ Molecular Distance Measures] —a tutorial on how to calculate RMSD
* [ RMSD] —another tutorial on how to calculate RMSD with example code

Wikimedia Foundation. 2010.

Look at other dictionaries:

  • Root-mean-square deviation — For the application of root mean square deviation to bioinformatics, see Root mean square deviation (bioinformatics). The root mean square deviation (RMSD) or root mean square error (RMSE) is a frequently used measure of the differences between… …   Wikipedia

  • Root mean square deviation — The root mean square deviation (RMSD) ( also root mean square error (RMSE) ) is a frequently used measure of the differences between values predicted by a model or an estimator and the values actually observed from the thing being modeled or… …   Wikipedia

  • Root mean square fluctuation — The mean square fluctuation (MSF) is a measure of the deviation between the position of particle i and some reference position. MSF=frac{1}{T}sum {t j=1}^{T}(x i(t j) ilde{x} i)^2,where T is the time over which one wants to average, and ilde{x} i …   Wikipedia

  • Mean — This article is about the statistical concept. For other uses, see Mean (disambiguation). In statistics, mean has two related meanings: the arithmetic mean (and is distinguished from the geometric mean or harmonic mean). the expected value of a… …   Wikipedia

  • Median absolute deviation — In statistics, the median absolute deviation (MAD) is a robust measure of the variability of a univariate sample of quantitative data. It can also refer to the population parameter that is estimated by the MAD calculated from a sample. For a… …   Wikipedia

  • List of mathematics articles (R) — NOTOC R R. A. Fisher Lectureship Rabdology Rabin automaton Rabin signature algorithm Rabinovich Fabrikant equations Rabinowitsch trick Racah polynomials Racah W coefficient Racetrack (game) Racks and quandles Radar chart Rademacher complexity… …   Wikipedia

  • Errors and residuals in statistics — For other senses of the word residual , see Residual. In statistics and optimization, statistical errors and residuals are two closely related and easily confused measures of the deviation of a sample from its theoretical value . The error of a… …   Wikipedia

  • Homology modeling — Homology modeling, also known as comparative modeling of protein refers to constructing an atomic resolution model of the target protein from its amino acid sequence and an experimental three dimensional structure of a related homologous protein… …   Wikipedia

  • Coefficient of variation — In probability theory and statistics, the coefficient of variation (CV) is a normalized measure of dispersion of a probability distribution. It is also known as unitized risk or the variation coefficient. The absolute value of the CV is sometimes …   Wikipedia

  • Mode (statistics) — In statistics, the mode is the value that occurs most frequently in a data set or a probability distribution.[1] In some fields, notably education, sample data are often called scores, and the sample mode is known as the modal score.[2] Like the… …   Wikipedia