Conserved domain database


Description: Conserved Domain Database (CDD) for the functional annotation of proteins.
Research center: National Center for Biotechnology Information
Authors: Aron Marchler-Bauer
Primary citation: Marchler-Bauer et al. (2011)[1]
Release date: 2003
Website: http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml

The Conserved Domain Database (CDD) is a database of well-annotated multiple sequence alignment models and derived database search models, for ancient domains and full-length proteins.[1]

Philosophy

Domains can be thought of as distinct functional and/or structural units of a protein. The two definitions often coincide: a unit of a polypeptide chain that folds independently frequently also carries a specific function. Domains are often identified as recurring sequence or structure units that occur in various contexts. In molecular evolution, such domains may have served as building blocks and may have been recombined in different arrangements to modulate protein function. CDD defines conserved domains as recurring units in molecular evolution, the extents of which can be determined by sequence and structure analysis.

The goal of the NCBI conserved domain curation project is to provide database users with insights into how patterns of residue conservation and divergence in a family relate to functional properties, and to provide useful links to more detailed information that may help to understand those sequence/structure/function relationships. To this end, CDD curators supplement and enrich the traditional multiple sequence alignments that form the foundation of domain models with the following types of information: three-dimensional structures and conserved core motifs, conserved features and sites, phylogenetic organization, and links to electronic literature resources.

Content

CDD content includes NCBI manually curated domain models and domain models imported from a number of external source databases (Pfam, SMART, COG, PRK, TIGRFAM). What is unique about NCBI-curated domains is that they use 3D-structure information to explicitly define domain boundaries, align blocks, amend alignment details, and provide insights into sequence/structure/function relationships. Manually curated models are organized hierarchically if they describe domain families that are clearly related by common descent. To provide a non-redundant view of the data, CDD clusters similar domain models from various sources into superfamilies.
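The idea of grouping similar models into non-redundant clusters can be illustrated with a toy sketch. It assumes that pairs of "similar" models are already known and simply takes connected components of that similarity graph. The model names, the input pairs, and the function `cluster_models` are hypothetical; the text above does not specify how CDD measures similarity, so this is not CDD's actual procedure.

```python
def cluster_models(models, similar_pairs):
    """Group models into clusters (connected components of the similarity graph)."""
    # Each model starts out as the representative of its own cluster.
    parent = {m: m for m in models}

    def find(m):
        # Follow parent links to the cluster representative (with path halving).
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    # Merge the clusters of every pair judged to be similar.
    for a, b in similar_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Collect the members of each cluster under its representative.
    clusters = {}
    for m in models:
        clusters.setdefault(find(m), []).append(m)
    return sorted(sorted(members) for members in clusters.values())


if __name__ == "__main__":
    # Hypothetical model names standing in for entries from different sources.
    models = ["curated_1", "pfam_1", "smart_1", "cog_1", "tigrfam_1"]
    pairs = [("curated_1", "pfam_1"), ("pfam_1", "smart_1")]
    print(cluster_models(models, pairs))
```

In this sketch, models that are never paired with another model remain in singleton clusters, and similarity is treated as transitive, which is a simplification.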

Searching the database

The collection is also part of NCBI's Entrez query and retrieval system and is cross-linked to numerous other resources. CDD provides annotation of domain footprints and conserved functional sites on protein sequences. Precalculated domain annotation can be retrieved for protein sequences tracked in NCBI's Entrez system. CDD's collection of models can be queried with novel protein sequences via the CD-Search service (http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi), or via Batch CD-Search (http://www.ncbi.nlm.nih.gov/Structure/bwrpsb/bwrpsb.cgi), which allows the computation and download of annotation for large sets of protein queries.
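Because the collection is part of Entrez, a text search against it can be expressed as an E-utilities request. The following minimal sketch only builds the request URL and does not contact NCBI. It assumes the public ESearch endpoint and the Entrez database name `cdd`; the helper `build_cdd_search_url` and the example search term are hypothetical and are not taken from the CDD documentation.

```python
from urllib.parse import urlencode

# Assumed base URL of NCBI's E-utilities ESearch endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_cdd_search_url(term, retmax=20):
    """Return an ESearch URL that queries the Entrez 'cdd' database for a term."""
    params = {
        "db": "cdd",        # Entrez database name for the Conserved Domain Database
        "term": term,       # free-text query, URL-encoded below
        "retmax": retmax,   # maximum number of record identifiers to return
        "retmode": "json",  # ask for a JSON response instead of XML
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    # Print the URL; fetching it (for example with urllib.request) is left to the caller.
    print(build_cdd_search_url("SH2 domain", retmax=5))
```

Keeping URL construction separate from the network call makes the request easy to inspect and to test offline. Sequence-based queries go through CD-Search or Batch CD-Search instead and are not covered by this sketch.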

References

  1. Marchler-Bauer A, Lu S, Anderson JB, Chitsaz F, Derbyshire MK, DeWeese-Scott C, Fong JH, Geer LY, Geer RC, Gonzales NR, Gwadz M, Hurwitz DI, Jackson JD, Ke Z, Lanczycki CJ, Lu F, Marchler GH, Mullokandov M, Omelchenko MV, Robertson CL, Song JS, Thanki N, Yamashita RA, Zhang D, Zhang N, Zheng C, Bryant SH (January 2011). "CDD: a Conserved Domain Database for the functional annotation of proteins". Nucleic Acids Res. 39 (Database issue): D225–9. doi:10.1093/nar/gkq1189. PMC 3013737. PMID 21109532. http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pmcentrez&artid=3013737

Wikimedia Foundation. 2010.
