Chargaff's rules

Chargaff's rules state that DNA from any cell of any organism should have a 1:1 ratio of pyrimidine to purine bases (the base pair rule) and, more specifically, that the amount of guanine equals the amount of cytosine and the amount of adenine equals the amount of thymine. This pattern is found in both strands of the DNA. The rules were discovered by the Austrian chemist Erwin Chargaff.[1][2][3][4][5][6]


Chargaff Parity Rule 1

The first rule holds that a double-stranded DNA molecule globally has percentage base pair equality: %A = %T and %G = %C.[6] The rigorous validation of the rule constitutes the basis of Watson-Crick pairs in the DNA double helix.
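
As a minimal sketch of why the first rule follows from base pairing, the Python snippet below builds the partner strand of a made-up sequence and counts bases over both strands of the duplex. The function names and the example sequence are illustrative assumptions, not data from Chargaff's work.

```python
from collections import Counter

# Watson-Crick partner of each base.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand):
    """Return the partner strand, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))


def duplex_percentages(strand):
    """Percentage of each base counted over both strands of the duplex."""
    counts = Counter(strand + reverse_complement(strand))
    total = sum(counts.values())
    return {base: 100.0 * counts[base] / total for base in "ATGC"}


# Hypothetical single strand with a deliberately uneven composition.
example = "AAAGGCATTA"
pct = duplex_percentages(example)
print(pct)
```

However skewed the single strand is, every A on one strand is matched by a T on the other, so the duplex totals come out equal by construction.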

Chargaff Parity Rule 2

The second rule holds that both %A ~ %T and %G ~ %C are valid for each of the two DNA strands.[7] This describes only a global feature of the base composition in a single DNA strand.[8]
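
The second rule is an approximate, statistical statement about one strand, so it is usually checked by measuring how far a strand departs from parity. The sketch below is one hedged way to do that in Python; the `skew` helper and the short example strand are assumptions made for illustration, and a sequence this short is not expected to obey the rule.

```python
from collections import Counter


def strand_percentages(strand):
    """Percentage of each base within a single strand."""
    counts = Counter(strand)
    total = len(strand)
    return {base: 100.0 * counts[base] / total for base in "ATGC"}


def skew(x, y):
    """Normalized difference (x - y) / (x + y); 0 means exact parity."""
    return (x - y) / (x + y) if (x + y) else 0.0


# Hypothetical single strand: 5 A, 4 T, 2 G, 2 C.
strand = "ATGCATTAGCATA"
pct = strand_percentages(strand)
at_skew = skew(pct["A"], pct["T"])  # positive: more A than T on this strand
gc_skew = skew(pct["G"], pct["C"])  # zero: G and C are balanced here
print(at_skew, gc_skew)
```

For a genome that follows the second rule, both skews computed over a whole strand would be close to zero.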


Another of Chargaff's findings, often also counted among his rules (and distinct from the second parity rule above), is that the composition of DNA varies from one species to another, in particular in the relative amounts of A, G, T, and C bases. Such evidence of molecular diversity, which had been presumed absent from DNA, made DNA a more credible candidate for the genetic material than protein.

In 2006, it was shown that the second parity rule applies to four of the five types of double-stranded genomes: eukaryotic chromosomes, bacterial chromosomes, double-stranded DNA viral genomes, and archaeal chromosomes.[9] It does not apply to the organellar genomes (mitochondria and plastids), although it does hold for many organellar genomes longer than about 20-30 kbp, nor does it apply to single-stranded DNA (viral) genomes or to any type of RNA genome, probably because all such genomes are very small. The basis for this rule is still under investigation.

The rule itself has consequences. In most bacterial genomes (which are generally 80-90% coding), genes are arranged so that approximately 50% of the coding sequence lies on each strand. Wacław Szybalski, in the 1960s, showed that in bacteriophage coding sequences purines (A and G) exceed pyrimidines (C and T).[10] This rule has since been confirmed in other organisms and should probably now be termed "Szybalski's rule". While Szybalski's rule generally holds, exceptions are known to exist.[11][12][13] The biological basis for Szybalski's rule, like that of Chargaff's, is not yet known.
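
A simple way to quantify the purine excess that Szybalski's rule describes is a normalized purine-minus-pyrimidine count over a coding sequence. The following Python sketch assumes a hypothetical helper name, `purine_excess`, and an invented toy coding sequence; it is not a measurement from any real genome.

```python
PURINES = set("AG")
PYRIMIDINES = set("CT")


def purine_excess(coding_seq):
    """(purines - pyrimidines) / (purines + pyrimidines) for one strand.

    A positive value means the sequence is purine-loaded.
    """
    seq = coding_seq.upper()
    r = sum(1 for base in seq if base in PURINES)
    y = sum(1 for base in seq if base in PYRIMIDINES)
    return (r - y) / (r + y) if (r + y) else 0.0


# Invented toy coding sequence: 14 purines and 4 pyrimidines.
example_cds = "ATGGCAGAAGGTAAATAA"
excess = purine_excess(example_cds)
print(excess)
```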

The combined effect of Chargaff's second rule and Szybalski's rule can be seen in bacterial genomes where the coding sequences are not equally distributed. The genetic code has 64 codons of which 3 function as termination codons: there are only 20 amino acids normally present in proteins. (There are two uncommon amino acids—selenocysteine and pyrrolysine—found in a limited number of proteins and encoded by the stop codons - TGA and TAG respectively.) The mismatch between the number of codons and amino acids allows several codons to code for a single amino acid. These codons normally differ in the third codon base position.
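
The codon arithmetic above can be checked directly. The sketch below builds the standard genetic code in Python from its usual compact 64-letter form (bases ordered T, C, A, G; `*` marks a stop) and groups codons by amino acid. The variable names are illustrative, and the table covers only the standard code, so the rare recoding of TGA and TAG is not modeled.

```python
from collections import defaultdict
from itertools import product

BASES = "TCAG"
# Standard genetic code for codons in the order TTT, TTC, TTA, TTG, TCT, ..., GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Group synonymous codons under the amino acid they encode.
codons_for = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    codons_for[aa].append(codon)

stop_codons = sorted(codons_for.pop("*"))
sense_codons = sum(len(codons) for codons in codons_for.values())

print(len(CODON_TABLE), "codons,", len(stop_codons), "stops,",
      sense_codons, "sense codons for", len(codons_for), "amino acids")
print("Alanine codons:", codons_for["A"])
```

The four alanine codons share their first two bases and differ only at the third position, which is the typical pattern of synonymous codons.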

Multivariate statistical analysis of codon use within genomes with unequal quantities of coding sequences on the two strands has shown that codon use in the third position depends on the strand on which the gene is located. This seems likely to be the result of Szybalski's and Chargaff's rules. Because of the asymmetry in pyrimidine and purine use in coding sequences, the strand with the greater coding content will tend to have the greater number of purine bases (Szybalski's rule). Because the number of purine bases will, to a very good approximation, equal the number of their complementary pyrimidines within the same strand, and because the coding sequences occupy 80-90% of the strand, there appear to be (1) a selective pressure on the third base to minimize the number of purine bases in the strand with the greater coding content, and (2) a proportionality between this pressure and the mismatch in the length of the coding sequences between the two strands.

The origin of the deviation from Chargaff's rule in the organelles has been suggested to be a consequence of the mechanism of replication.[14] During replication the DNA strands separate. In single-stranded DNA, cytosine slowly and spontaneously deaminates to uracil, which is read as thymine (a C to T transition). The longer the strands are separated, the greater the quantity of deamination. For reasons that are not yet clear, the strands tend to exist longer in single-stranded form in mitochondria than in chromosomal DNA. This process tends to yield one strand that is enriched in guanine (G) and thymine (T) with its complement enriched in cytosine (C) and adenine (A), and it may have given rise to the deviations found in the mitochondria.

Chargaff's second rule appears to be the consequence of a more complex parity rule: within a single strand of DNA, any oligonucleotide is present in equal numbers to its reverse-complementary oligonucleotide. Because of the computational requirements, this has not been verified in all genomes for all oligonucleotides. It has been verified for triplet oligonucleotides for a large data set.[15] Albrecht-Buehler has suggested that this rule is the consequence of genomes evolving by a process of inversion and transposition.[15] This process does not appear to have acted on the mitochondrial genomes. Chargaff's second parity rule appears to extend from the nucleotide level to populations of codon triplets in the case of whole single-stranded human genome DNA.[16] A kind of "codon-level second Chargaff's parity rule" is proposed as follows:

Codon populations where the 1st base position is T are approximately equal to codon populations where the 3rd base position is A:
« % codons Twx ~ % codons yzA » (where Twx and yzA are mirror codons, e.g. TCG and CGA).
Codon populations where the 1st base position is C are approximately equal to codon populations where the 3rd base position is G:
« % codons Cwx ~ % codons yzG » (where Cwx and yzG are mirror codons, e.g. CTA and TAG).
Codon populations where the 2nd base position is T are approximately equal to codon populations where the 2nd base position is A:
« % codons wTx ~ % codons yAz » (where wTx and yAz are mirror codons, e.g. CTG and CAG).
Codon populations where the 2nd base position is C are approximately equal to codon populations where the 2nd base position is G:
« % codons wCx ~ % codons yGz » (where wCx and yGz are mirror codons, e.g. TCT and AGA).
Codon populations where the 3rd base position is T are approximately equal to codon populations where the 1st base position is A:
« % codons wxT ~ % codons Ayz » (where wxT and Ayz are mirror codons, e.g. CTT and AAG).
Codon populations where the 3rd base position is C are approximately equal to codon populations where the 1st base position is G:
« % codons wxC ~ % codons Gyz » (where wxC and Gyz are mirror codons, e.g. GGC and GCC).
Examples: computing over the whole human genome using the first codon reading frame gives
36530115 TTT and 36381293 AAA (ratio = 1.00409); 2087242 TCG and 2085226 CGA (ratio = 1.00096); and so on.
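
Each pair of mirror codons listed above is a triplet and its reverse complement, so the codon-level rule can be tested by counting triplets on one strand and comparing each count with that of its reverse complement. The Python sketch below shows one way this might be done; the function names are hypothetical, the toy strand is invented, and only the two genome-scale counts in the last lines are taken from the text.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Complement every base, then reverse the result."""
    return seq.translate(COMPLEMENT)[::-1]


def codon_counts(strand, frame=0):
    """Count non-overlapping triplets in one reading frame of a single strand."""
    return Counter(strand[i:i + 3] for i in range(frame, len(strand) - 2, 3))


def parity_ratios(counts):
    """Ratio count(codon) / count(reverse complement) for each pair observed."""
    ratios = {}
    for codon, n in counts.items():
        partner = reverse_complement(codon)
        if counts.get(partner):
            ratios[(codon, partner)] = n / counts[partner]
    return ratios


# Invented toy strand: TTT, AAA, TTT in the first reading frame.
toy_counts = codon_counts("TTTAAATTT")
toy_ratios = parity_ratios(toy_counts)

# Ratio for the TTT and AAA counts quoted in the text.
ttt_aaa_ratio = 36530115 / 36381293
print(round(ttt_aaa_ratio, 5))
```

On a real genome the strand would be read from a sequence file, and ratios close to 1 across all pairs would indicate compliance with the codon-level rule.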

Relative proportions (%) of bases in DNA

The following table is a representative sample of Erwin Chargaff's 1952 data, listing the base composition of DNA from various organisms; the data support both of Chargaff's rules.[17]

Organism      %A    %G    %C    %T    A/T   G/C   %GC   %AT
φX174         24.0  23.3  21.5  31.2  0.77  1.08  44.8  55.2
Maize         26.8  22.8  23.2  27.2  0.99  0.98  46.1  54.0
Octopus       33.2  17.6  17.6  31.6  1.05  1.00  35.2  64.8
Chicken       28.0  22.0  21.6  28.4  0.99  1.02  43.7  56.4
Rat           28.6  21.4  20.5  28.4  1.01  1.00  42.9  57.0
Human         29.3  20.7  20.0  30.0  0.98  1.04  40.7  59.3
Grasshopper   29.3  20.5  20.7  29.3  1.00  0.99  41.2  58.6
Sea Urchin    32.8  17.7  17.3  32.1  1.02  1.02  35.0  64.9
Wheat         27.3  22.7  22.8  27.1  1.01  1.00  45.5  54.4
Yeast         31.3  18.7  17.1  32.9  0.95  1.09  35.8  64.4
E. coli       24.7  26.0  25.7  23.6  1.05  1.01  51.7  48.3
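
The derived columns in the table follow from the four base percentages. As a small worked example, the Python sketch below recomputes them for the Octopus row; the function name `composition_ratios` is an assumption made for illustration.

```python
def composition_ratios(a, g, c, t):
    """Derive A/T, G/C, %GC and %AT from the four base percentages."""
    return {
        "A/T": round(a / t, 2),
        "G/C": round(g / c, 2),
        "%GC": round(g + c, 1),
        "%AT": round(a + t, 1),
    }


# Octopus row of the table: %A, %G, %C, %T.
octopus = composition_ratios(a=33.2, g=17.6, c=17.6, t=31.6)
print(octopus)
```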

References

  1. Elson D, Chargaff E (1952). "On the deoxyribonucleic acid content of sea urchin gametes". Experientia 8 (4): 143–145. doi:10.1007/BF02170221. PMID 14945441.
  2. Chargaff E, Lipshitz R, Green C (1952). "Composition of the deoxypentose nucleic acids of four genera of sea-urchin". J Biol Chem 195 (1): 155–160. PMID 14938364.
  3. Chargaff E, Lipshitz R, Green C, Hodes ME (1951). "The composition of the deoxyribonucleic acid of salmon sperm". J Biol Chem 192 (1): 223–230. PMID 14917668.
  4. Chargaff E (1951). "Some recent studies on the composition and structure of nucleic acids". J Cell Physiol Suppl 38 (Suppl).
  5. Magasanik B, Vischer E, Doniger R, Elson D, Chargaff E (1950). "The separation and estimation of ribonucleotides in minute quantities". J Biol Chem 186 (1): 37–50. PMID 14778802.
  6. Chargaff E (1950). "Chemical specificity of nucleic acids and mechanism of their enzymatic degradation". Experientia 6 (6): 201–209. doi:10.1007/BF02173653. PMID 15421335.
  7. Rudner R, Karkas JD, Chargaff E (1968). "Separation of B. subtilis DNA into complementary strands. III. Direct analysis". Proc Natl Acad Sci USA 60: 921–922.
  8. Zhang CT, Zhang R, Ou HY (2003). "The Z curve database: a graphic representation of genome sequences". Bioinformatics 19 (5): 593–599. doi:10.1093/bioinformatics/btg041. PMID 12651717.
  9. Mitchell D, Bridge R (2006). "A test of Chargaff's second rule". Biochem Biophys Res Commun 340 (1): 90–94. doi:10.1016/j.bbrc.2005.11.160. PMID 16364245.
  10. Szybalski W, Kubinski H, Sheldrick O (1966). "Pyrimidine clusters on the transcribing strand of DNA and their possible role in the initiation of RNA synthesis". Cold Spring Harbor Symp Quant Biol 31: 123–127. PMID 4966069.
  11. Cristillo AD (1998). Characterization of G0/G1 switch genes in cultured T lymphocytes. Kingston, Ontario, Canada: Queen's University.
  12. Bell SJ, Forsdyke DR (1999). "Deviations from Chargaff's second parity rule correlate with direction of transcription". J Theor Biol 197 (1): 63–76. doi:10.1006/jtbi.1998.0858. PMID 10036208.
  13. Lao PJ, Forsdyke DR (2000). "Thermophilic Bacteria Strictly Obey Szybalski's Transcription Direction Rule and Politely Purine-Load RNAs with Both Adenine and Guanine". Genome Res 10 (2): 228–236. doi:10.1101/gr.10.2.228. PMC 310832. PMID 10673280.
  14. Nikolaou C, Almirantis Y (2006). "Deviations from Chargaff's second parity rule in organellar DNA. Insights into the evolution of organellar genomes". Gene 381: 34–41. doi:10.1016/j.gene.2006.06.010. PMID 16893615.
  15. Albrecht-Buehler G (2006). "Asymptotically increasing compliance of genomes with Chargaff's second parity rules through inversions and inverted transpositions". Proc Natl Acad Sci USA 103 (47): 17828–17833. doi:10.1073/pnas.0605553103. PMC 1635160. PMID 17093051.
  16. Perez J-C (2010). "Codon populations in single-stranded whole human genome DNA are fractal and fine-tuned by the Golden Ratio 1.618". Interdisciplinary Sciences: Computational Life Science 2 (3): 228–240. doi:10.1007/s12539-010-0022-0. PMID 20658335.
  17. Bansal M (2003). "DNA structure: Revisiting the Watson-Crick double helix". Current Science 85 (11): 1556–1563.
  18. Hallin PF, Ussery D (2004). "CBS Genome Atlas Database: A dynamic storage for bioinformatic results and sequence data". Bioinformatics 20 (18): 3682–3686. doi:10.1093/bioinformatics/bth423. PMID 15256401.

Wikimedia Foundation. 2010.