Ensembl


Ensembl is a joint scientific project between the European Bioinformatics Institute and the Wellcome Trust Sanger Institute, which was launched in 1999 in response to the imminent completion of the Human Genome Project. Its aim is to provide a centralised resource for geneticists, molecular biologists and other researchers studying the genomes of our own species and other vertebrates.

Background

The human genome consists of about 3 billion base pairs and contains an estimated 20,000 to 25,000 protein-coding genes. However, the genome sequence alone is of little use unless the locations and relationships of individual genes can be identified. One option is manual annotation, whereby a team of scientists tries to locate genes using experimental data from scientific journals and public databases. This is a slow, painstaking task, so Ensembl instead uses computers to perform the complex pattern-matching of protein to DNA. Sequence data are fed into a software "pipeline" (written in Perl), which creates a set of predicted gene locations and saves them in a MySQL database for subsequent analysis and display.

An important aspect of the Ensembl philosophy is that these data should be freely accessible to the world research community. All the data and code produced by the Ensembl project are available to download, and a publicly accessible database server allows remote access. In addition, a website [http://www.ensembl.org www.ensembl.org] provides computer-generated visual displays of much of the data.

Since the project's beginning, its remit has expanded to include additional species (including key model organisms such as mouse, fruitfly and zebrafish) as well as a wider range of genomic data, including genetic variations and regulatory features. From late 2008 a new project, Ensembl Genomes, will be extending the scope of Ensembl into plants, fungi, bacteria and protists, whilst the original project continues to focus on vertebrates.

Displaying genomic data

Central to the Ensembl concept is the ability to automatically generate graphical views of the alignment of genes and other genomic data against a reference genome. These are shown as data tracks, and individual tracks can be turned on and off, allowing the user to customise the display to suit their research interests. The interface also enables the user to zoom in to a region or move along the genome in either direction.

Other displays show data at varying levels of resolution, from whole karyotypes down to text-based representations of DNA and amino acid sequences. Further displays present other kinds of information, such as trees of similar genes (homologues) across a range of species. The graphics are complemented by tabular displays, and in many cases data can be exported directly from the page in a variety of standard file formats such as FASTA.
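To illustrate what an exported FASTA file contains, here is a minimal Python sketch of a FASTA reader. It assumes the common convention of a header line starting with ">" followed by one or more sequence lines. The function name parse_fasta and the sample records are invented for illustration and are not part of Ensembl.

```python
from typing import Dict, Iterable, List, Optional


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Return a mapping of record identifier -> concatenated sequence.

    The identifier is taken to be the first whitespace-delimited token
    after '>' (an assumption; header conventions vary between sources).
    """
    records: Dict[str, str] = {}
    current_id: Optional[str] = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if current_id is not None:
                records[current_id] = "".join(chunks)
            header = line[1:].split()
            current_id = header[0] if header else ""
            chunks = []
        elif current_id is not None:
            chunks.append(line)
    if current_id is not None:
        records[current_id] = "".join(chunks)
    return records


# Hypothetical two-record input; identifiers and sequences are made up.
example = """>seq1 example description
ACGTAC
GTTA
>seq2
MKV
"""
sequences = parse_fasta(example.splitlines())
```

The sequence of each record may be wrapped over several lines, which is why the reader joins the lines belonging to one header before storing them.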

Externally produced data can also be added to the display, either via a DAS (Distributed Annotation System) server on the internet, or by uploading a suitable file in one of the supported formats, such as BED or PSL.
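As a rough illustration of what an uploaded BED file holds, the Python sketch below reads the first six BED columns (chromosome, start, end, name, score, strand). It assumes the usual BED convention of 0-based, half-open coordinates. The names BedFeature and parse_bed and the sample lines are hypothetical and do not describe how Ensembl itself processes uploads.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class BedFeature:
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive
    name: Optional[str] = None
    score: Optional[float] = None
    strand: Optional[str] = None

    @property
    def length(self) -> int:
        # Half-open coordinates: length is simply end - start.
        return self.end - self.start


def parse_bed(lines: Iterable[str]) -> List[BedFeature]:
    """Parse BED lines, ignoring comments and track/browser header lines."""
    features: List[BedFeature] = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        if len(fields) < 3:
            raise ValueError(f"BED line needs at least 3 fields: {line!r}")
        name = fields[3] if len(fields) > 3 else None
        score = None
        if len(fields) > 4 and fields[4] != ".":
            score = float(fields[4])
        strand = None
        if len(fields) > 5 and fields[5] in ("+", "-"):
            strand = fields[5]
        features.append(
            BedFeature(fields[0], int(fields[1]), int(fields[2]), name, score, strand)
        )
    return features


# Hypothetical input; feature names and coordinates are invented.
example_lines = [
    "track name=example",
    "chr1\t1000\t1500\tfeatureA\t500\t+",
    "chr2\t200\t260",
]
features = parse_bed(example_lines)
```

Only the first three columns are required, so the optional fields default to None when a line is shorter.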

Graphics are generated using a suite of custom Perl modules based on GD, the standard Perl graphics display library.

Alternative access methods

In addition to its website, Ensembl provides a Perl API (Application Programming Interface) that models biological objects such as genes and proteins, allowing simple scripts to be written to retrieve data of interest. This software can be used to access the public MySQL database, avoiding the need to download enormous datasets.
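The idea of modelling biological objects in code can be sketched independently of any particular library. The example below is written in Python rather than Perl and does not reproduce the Ensembl API: every class, attribute and identifier in it is hypothetical, and the 1-based inclusive coordinates are an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Exon:
    start: int  # 1-based, inclusive (assumed for this sketch)
    end: int    # inclusive

    @property
    def length(self) -> int:
        return self.end - self.start + 1


@dataclass
class Transcript:
    stable_id: str
    exons: List[Exon] = field(default_factory=list)

    @property
    def spliced_length(self) -> int:
        # Length of the transcript once introns are removed.
        return sum(exon.length for exon in self.exons)


@dataclass
class Gene:
    stable_id: str
    chromosome: str
    strand: int  # +1 or -1
    transcripts: List[Transcript] = field(default_factory=list)

    def longest_transcript(self) -> Transcript:
        return max(self.transcripts, key=lambda t: t.spliced_length)


# Invented gene with two transcripts; identifiers are placeholders.
gene = Gene(
    stable_id="GENE0001",
    chromosome="1",
    strand=1,
    transcripts=[
        Transcript("TRANS0001", [Exon(100, 199), Exon(300, 349)]),
        Transcript("TRANS0002", [Exon(100, 199)]),
    ],
)
longest = gene.longest_transcript()
```

With objects like these, a short script can ask questions such as "which transcript of this gene is longest?" without dealing with the underlying database tables directly.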

Current species

The annotated genomes include most fully sequenced vertebrates and selected model organisms. All of them are eukaryotes; no prokaryotes are included. The current list comprises:

* Chordates
** Mammals
*** Primates: Bushbaby, Chimp, Human, Macaque, Mouse Lemur, Orangutan, Tarsier
*** Rodents etc.: Guinea pig, Kangaroo rat, Mouse, Pika, Rabbit, Rat, Ground Squirrel, Tree shrew
*** Laurasiatheria: Alpaca, Cat, Cow, Dog, Dolphin, Hedgehog, Horse, Megabat, Microbat, Shrew, Pig (pre)
*** Afrotheria: Elephant, Hyrax, Tenrec
*** Xenarthra: Armadillo
*** Marsupials & Monotremes: Opossum, Platypus
** Birds: Chicken
** Fish: Takifugu rubripes (Fugu), Tetraodon nigroviridis (Green spotted pufferfish), Danio rerio (Zebrafish), Oryzias latipes (Medaka), Gasterosteus aculeatus (Stickleback), Petromyzon marinus (Sea lamprey) (pre)
** Reptiles & Amphibians: Xenopus tropicalis, Anole Lizard (pre)
** Ancient relatives: Ciona intestinalis, Ciona savignyi
* Invertebrates
** Insects: Anopheles gambiae (Mosquito), Fruitfly, Aedes aegypti (Mosquito)
** Worm: Caenorhabditis elegans
* Yeast: Saccharomyces cerevisiae (Baker's yeast)

See also

* Sequence analysis
* Sequence profiling tool
* Sequence motif

External links

* [http://www.ensembl.org Ensembl]
* [http://vega.sanger.ac.uk Vega]
* [http://pre.ensembl.org Pre-Ensembl]
* [http://www.ensemblgenomes.org Ensembl genomes]


Wikimedia Foundation. 2010.
