# GC-content

GC-content (or guanine-cytosine content), in molecular biology, is the percentage of nitrogenous bases in a DNA molecule that are either guanine or cytosine (out of four possible bases, the others being adenine and thymine). [ [http://cancerweb.ncl.ac.uk/cgi-bin/omd?GC+content Definition of GC-content on CancerWeb of Newcastle University, UK] ] This may refer to a specific fragment of DNA or RNA, or to the whole genome. When it refers to a fragment of the genetic material, it may denote the GC-content of part of a gene (domain), a single gene, a group of genes (or gene cluster), or even a non-coding region. G (guanine) and C (cytosine) pair through specific hydrogen bonding, whereas A (adenine) bonds specifically with T (thymine). The GC pair is held by three hydrogen bonds and the AT pair by two, and thus GC pairs are more thermostable than AT pairs. [Freifelder D. Microbial Genetics, 1st ed. Jones and Bartlett Publishers, 1990. ISBN 81-85198-33-0] In spite of the higher thermostability conferred on the genetic material, it has been suggested that cells with high-GC DNA undergo autolysis, thereby reducing the longevity of the cell "per se". [ [http://www.ncbi.nlm.nih.gov/sites/entrez?cmd=Retrieve&db=PubMed&list_uids=7999&dopt=Abstract Levin RE. and Van Sickle C., Autolysis of high-GC isolates of Pseudomonas putrefaciens. Antonie Van Leeuwenhoek. 42(1-2):145-55, 1976] ] Because of the robustness this confers on the genetic material of high-GC organisms, it was commonly believed that GC content played a vital part in adaptation to high temperatures, a hypothesis that was refuted in 2001. [Hurst, LD. and Merchant, AR. High Guanine-Cytosine Content is Not an Adaptation to High Temperature: A Comparative Analysis amongst Prokaryotes. Proceedings: Biological Sciences, 268(466) 493-497, 2001.]

In PCR experiments, the GC-content of a primer is used to estimate its annealing temperature to the template DNA. A higher GC-content indicates a higher melting temperature.
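The link between primer GC-content and melting temperature can be illustrated with a rough sketch. It assumes the Wallace rule of thumb (2 °C per A or T, 4 °C per G or C), which is not given in the text above and is only a coarse approximation for short oligonucleotides; the function name `wallace_tm` and the two example primers are hypothetical.

```python
def wallace_tm(primer: str) -> int:
    """Rough melting temperature in deg C using the Wallace rule (an assumption):
    Tm = 2 * (A + T) + 4 * (G + C). Only meaningful for short primers."""
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Two made-up 20-nucleotide primers with different GC-content.
at_rich = "ATATATATATATATATATAT"  # 0% GC
gc_rich = "GCGCGCGCGCATATATATAT"  # 50% GC

print(wallace_tm(at_rich))  # 40
print(wallace_tm(gc_rich))  # 60
```

Both primers have the same length, so the difference in the estimate comes entirely from the GC-content. Real primer design tools typically use more detailed thermodynamic models.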

Determination of GC content

GC content is usually expressed as a percentage value, but sometimes as a ratio (called G+C ratio or GC-ratio). GC-content percentage is calculated as [Madigan MT. and Martinko JM. Brock Biology of Microorganisms, 10th ed. Pearson-Prentice Hall, 2003. ISBN 84-205-3679-2]

:$\cfrac{G+C}{A+T+G+C} \times 100$

whereas the G+C ratio is calculated as [ [http://www.biochem.northwestern.edu/holmgren/Glossary/Definitions/Def-A/A+T_G+C_ratio.html Definition of GC-ratio on Northwestern University, IL, USA] ]

:$\cfrac{A+T}{G+C}$

The GC-content percentage and the GC-ratio can be measured by several means, but one of the simplest is to measure the melting temperature of the DNA double helix using spectrophotometry. The absorbance of DNA at a wavelength of 260 nm increases fairly sharply when sufficient heating separates the double-stranded DNA into two single strands. [ [http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=156059 Wilhelm, J., Pingoud, A. and Hahn, M. Real-time PCR-based method for the estimation of genome sizes. Nucleic Acids Res. 15; 31(10):56, 2003.] ] For large numbers of samples, the most commonly used protocol for determining GC ratios uses flow cytometry. [ [http://www.ncbi.nlm.nih.gov/sites/entrez?cmd=Retrieve&db=PubMed&list_uids=7518377&dopt=Abstract Vinogradov, AE. Measurement by flow cytometry of genomic AT/GC ratio and genome size. Cytometry. 1;16(1):34-40, 1994] ]
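As a hedged illustration of how a measured melting temperature might be converted into a GC-content estimate, the sketch below assumes the empirical Marmur-Doty relation, Tm = 69.3 + 0.41 × (%GC), which is not part of the text above and was derived for DNA in standard saline citrate buffer. The constants change with salt concentration, so the function `gc_percent_from_tm` and its defaults should be read as assumptions rather than a general method.

```python
def gc_percent_from_tm(tm_celsius: float,
                       intercept: float = 69.3,
                       slope: float = 0.41) -> float:
    """Estimate GC-content (%) from a melting temperature in deg C.

    Assumes the linear Marmur-Doty relation Tm = intercept + slope * GC%,
    with default constants that apply only to standard saline citrate buffer.
    """
    return (tm_celsius - intercept) / slope


# Hypothetical measurement: a sample that melts at 90.0 deg C.
estimate = gc_percent_from_tm(90.0)
print(f"Estimated GC-content: {estimate:.1f}%")
```

Because the relation is linear, each additional degree of melting temperature corresponds to roughly 2.4 percentage points of GC-content under these assumed constants.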

Alternatively, if the DNA or RNA molecule under investigation has been sequenced, the GC-content can be calculated exactly by simple arithmetic.
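For a sequenced molecule, the arithmetic can be sketched as follows. The function names `gc_percent` and `at_gc_ratio` are hypothetical, the ratio follows the (A+T)/(G+C) definition given in this section, and the code assumes that uracil in RNA is counted in place of thymine and that ambiguous symbols such as N are ignored.

```python
from collections import Counter


def base_counts(seq: str) -> Counter:
    """Count bases case-insensitively, treating RNA uracil (U) as thymine (T)."""
    return Counter(seq.upper().replace("U", "T"))


def gc_percent(seq: str) -> float:
    """GC-content as a percentage: (G + C) / (A + T + G + C) * 100."""
    c = base_counts(seq)
    gc = c["G"] + c["C"]
    total = gc + c["A"] + c["T"]
    if total == 0:
        raise ValueError("sequence contains no A, T/U, G or C bases")
    return 100.0 * gc / total


def at_gc_ratio(seq: str) -> float:
    """Ratio (A + T) / (G + C), as defined in the section above."""
    c = base_counts(seq)
    gc = c["G"] + c["C"]
    if gc == 0:
        raise ValueError("ratio is undefined for a sequence without G or C")
    return (c["A"] + c["T"]) / gc


print(gc_percent("ATGCGCTA"))  # 50.0
print(at_gc_ratio("AATTGC"))   # 2.0
```

Counting only the four standard bases in the denominator keeps the result consistent with the formula above even when a sequence contains unresolved positions.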

GC ratio of genomes

The GC ratio within a genome is found to be markedly variable. In higher organisms, these variations result in a mosaic-like formation of islet regions called isochores. [ [http://www.ncbi.nlm.nih.gov/sites/entrez?cmd=Retrieve&db=PubMed&list_uids=10607893&dopt=Abstract Bernardi, G., Isochores and the evolutionary genomics of vertebrates, Gene, 241:3-17, 2000.] ] This results in variations in staining intensity along the chromosomes. [ [http://www.ncbi.nlm.nih.gov/sites/entrez?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=12700172 Furey, T.S. and Haussler, D., Integration of the cytogenetic map with the draft human genome sequence, Human Molecular Genetics, 12(9):1037-1044, 2003.] ] The isochores contain essential protein-coding genes, termed "housekeeping genes", and thus determining the GC ratio of these specific regions contributes to mapping these essential genes. [ [http://cat.inist.fr/?aModele=afficheN&cpsidt=4851783 Sumner, A.T., de la Torre, J. and Stuppia, L., The distribution of genes on chromosomes: A cytological approach, Journal of Molecular Evolution, 37:117-122, 1993.] ] [ [http://www.ncbi.nlm.nih.gov/sites/entrez?cmd=Retrieve&db=PubMed&list_uids=1937049&dopt=Abstract Aïssani, B., and Bernardi, G., CpG islands, genes and isochores in the genomes of vertebrates, Gene, 106:185-195, 1991.] ]
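Variation of GC-content along a sequence is often examined with a sliding window. The sketch below is a minimal illustration of that idea, not a method taken from the cited papers: the function `windowed_gc`, the window and step sizes, and the toy sequence are all hypothetical, and real isochores span far longer stretches than this example.

```python
def windowed_gc(seq: str, window: int, step: int) -> list[tuple[int, float]]:
    """Return (start position, GC%) for each full window along the sequence.

    Any symbol other than G or C (including N) counts as non-GC here.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        profile.append((start, 100.0 * gc / window))
    return profile


# Toy sequence: an AT-only stretch followed by a GC-only stretch.
toy = "AT" * 10 + "GC" * 10
for start, pct in windowed_gc(toy, window=10, step=10):
    print(start, pct)
# 0 0.0
# 10 0.0
# 20 100.0
# 30 100.0
```

A smaller step gives a smoother profile at the cost of more windows; a larger window averages out local fluctuations.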

GC ratios and coding sequence

Within a long region of genomic sequence, genes are often characterised by a higher GC-content than the background GC-content of the entire genome. Comparison of the GC ratio with the length of the coding region of a gene has shown that the length of the coding sequence is directly proportional to its G+C content. [ [http://www.springerlink.com/content/l1kx1knn0wv8w0kh/ Oliver, JL. and Marín, A., A Relationship Between GC Content and Coding-Sequence Length. Journal of Molecular Evolution 43(3) 216-223, 2004] ] This has been attributed to the fact that the stop codon is biased towards A and T nucleotides, so the shorter the sequence, the higher the AT bias. [Wuitschick, JD. and Karrer, KM., Analysis of Genomic G + C Content, Codon Usage, Initiator Codon Context and Translation Termination Sites in Tetrahymena thermophila. The Journal of Eukaryotic Microbiology, 46(3) 239–247, 1999.]

Application in systematics

GC content varies between organisms; variation in selection, mutational bias and biased recombination-associated DNA repair are all thought to contribute to this. [ [http://genomebiology.com/2002/3/10/reports/0058 Birdsell, JA., Integrating genomics, bioinformatics and classical genetics to study the effects of recombination on genome evolution. Mol Biol Evol 2002, 19:1181-1197.] ] The species problem in prokaryotic taxonomy has led to various suggestions for classifying bacteria, and the "ad hoc committee on reconciliation of approaches to bacterial systematics" has recommended the use of GC ratios in higher-level hierarchical classification. [ [http://cat.inist.fr/?aModele=afficheN&cpsidt=7750835 Wayne, LG et al., Report of the ad hoc committee on reconciliation of approaches to bacterial systematics. International Journal of Systematic Bacteriology 37(4) 463-464, 1987.] ] For example, the Actinobacteria are characterised as "high GC-content bacteria". [ [http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Tree&id=1760&lvl=3&lin=f&keep=1&srchmode=1&unlock Taxonomy browser on NCBI] ] In "Streptomyces coelicolor" A3(2), the GC content is 72%. [ [http://www.ncbi.nlm.nih.gov/sites/entrez?db=genomeprj&cmd=Retrieve&dopt=Overview&list_uids=242 Whole genome data of "Streptomyces coelicolor" A3(2) on NCBI] ] The GC-content of yeast ("Saccharomyces cerevisiae") is 38%, [ [http://www.ncbi.nlm.nih.gov/sites/entrez?db=genomeprj&cmd=Retrieve&dopt=Overview&list_uids=128 Whole genome data of "Saccharomyces cerevisiae" on NCBI] ] and that of another common model organism, thale cress ("Arabidopsis thaliana"), is 36%. [ [http://www.ncbi.nlm.nih.gov/sites/entrez?db=genomeprj&cmd=Retrieve&dopt=Overview&list_uids=116 Whole genome data of "Arabidopsis thaliana" on NCBI] ] Because of the nature of the genetic code, it is virtually impossible for an organism to have a genome with a GC-content approaching either 0% or 100%. A species with an extremely low GC-content is "Plasmodium falciparum" (GC% = ~20%), [ [http://www.ncbi.nlm.nih.gov/sites/entrez?db=genomeprj&cmd=Retrieve&dopt=Overview&list_uids=148 Whole genome data of "Plasmodium falciparum" on NCBI] ] and it is common to refer to such examples as AT-rich rather than GC-poor. [ Musto, H., Caccio, S., Rodriguez-Maseda, H. and Bernardi, G. Compositional Constraints in the Extremely GC-poor Genome of Plasmodium falciparum. Mem Inst Oswaldo Cruz, 92(6): 835-841, 1997. [http://www.scielo.br/pdf/mioc/v92n6/3431.pdf Full Article] ]

Wikimedia Foundation. 2010.
