Stockholm Bioinformatics Center, SBC
Lecture notes: Structural biochemistry and bioinformatics 2001

Lecture 23 Nov 2001, Per Kraulis

The genomes

2. The current genomes

The list of genomes determined so far is growing rapidly, probably at an exponential rate, which makes it difficult to keep up to date. Luckily, a number of web sites (try to) keep track of the status of the various genome projects. Here are links to some of them:

There are also a number of web sites that focus on data for specific genomes. Usually, these sites contain more than just the DNA sequence of the genome: predicted transcripts (ORFs, Open Reading Frames), verified transcripts, and tentative identifications or classifications of the predicted proteins.
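
As a rough illustration of what predicting ORFs involves, here is a minimal sketch in Python. It is not the procedure used by any of the genome projects: it scans only the forward strand, treats every ATG as a possible start, and uses the standard stop codons. The function name `find_orfs`, the parameter `min_codons` and the toy sequence are assumptions made for this example.

```python
# Minimal ORF scan (illustrative only; real gene finders are far more elaborate).
STOP_CODONS = {"TAA", "TAG", "TGA"}  # stop codons of the standard genetic code


def find_orfs(seq, min_codons=3):
    """Return (start, end) index pairs of ORFs on the forward strand.

    An ORF here is an in-frame stretch ATG ... stop codon. Its length is
    counted in codons, including the ATG but excluding the stop codon.
    `end` is the index just past the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first ATG opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # look for the next ATG in this frame
    return sorted(orfs)


# Hypothetical toy sequence containing one short and one longer ORF.
toy = "ATGAAATAACCATGAAACCCGGGTTTTAG"
for threshold in (2, 3, 6):
    print(threshold, find_orfs(toy, min_codons=threshold))
```

Changing the hypothetical `min_codons` threshold changes how many ORFs are reported for the same sequence, which is one reason why ORF counts for a genome differ between analyses.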

Here is a table of a few complete genomes, with information and links. Please note that the number of ORFs given below for each genome is tentative. The numbers depend on the exact procedure used to identify known genes and predict previously unknown genes (see the section Analysing a genome).

| Organism | Type | Size, ORFs | Links | Comment |
|---|---|---|---|---|
| Haemophilus influenzae | Bacterial | 1.83 Mb, 1850 ORFs | The Haemophilus influenzae Genome Database at TIGR | The first genome of a free-living organism. 1995 |
| Escherichia coli | Bacterial | 4.64 Mb, 4289 ORFs | E. coli Genome Project, University of Wisconsin-Madison | The most studied bacterium. 1997 |
| Rickettsia prowazekii | Bacterial | 1.11 Mb, 834 ORFs | | The first genome to be sequenced in Sweden (Siv Andersson, Uppsala). 1998 |
| Methanococcus jannaschii | Archaeal | 1.66 Mb, 1750 ORFs | The Methanococcus jannaschii Genome Database at TIGR | The first sequenced archaeon. 1996 |
| Saccharomyces cerevisiae | Eukaryote | 12.1 Mb (16 chromosomes), 6294 ORFs | SGD, MIPS yeast DB | The first sequenced eukaryote. 1997 |
| Caenorhabditis elegans | Eukaryote, nematode | 97 Mb (6 chromosomes), 19,099 ORFs | WormBase, C. elegans Genome Project | The first sequenced multicellular organism. |
| Drosophila melanogaster | Eukaryote, insect | 137 Mb (excluding heterochromatin, 6 chromosomes), 14,100 ORFs | BDGP, FlyBase | Celera Corp, publicly available. 2000 |
| Homo sapiens | Eukaryote, primate | 3000 Mb (24 chromosomes), ? ORFs | HGP at Sanger, HGP at Oak Ridge | Rough draft exists. Not yet finished, except for chromosomes 21 and 22. |
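
One simple way to compare the genomes in the table is the number of ORFs per megabase. The sketch below only does this arithmetic on the tentative sizes and ORF counts listed above; the names `genomes`, `orfs_per_mb` and `density` are chosen for this example. Homo sapiens is left out because no ORF count is given, and the Drosophila size excludes heterochromatin.

```python
# (organism, genome size in Mb, number of ORFs), copied from the table above.
genomes = [
    ("Haemophilus influenzae", 1.83, 1850),
    ("Escherichia coli", 4.64, 4289),
    ("Rickettsia prowazekii", 1.11, 834),
    ("Methanococcus jannaschii", 1.66, 1750),
    ("Saccharomyces cerevisiae", 12.1, 6294),
    ("Caenorhabditis elegans", 97, 19099),
    ("Drosophila melanogaster", 137, 14100),
]


def orfs_per_mb(size_mb, n_orfs):
    """Gene density as predicted ORFs per megabase of sequence."""
    return n_orfs / size_mb


density = {name: orfs_per_mb(size, n) for name, size, n in genomes}

# Print the genomes from the most to the least densely packed.
for name, d in sorted(density.items(), key=lambda item: -item[1]):
    print(f"{name:28s} {d:7.0f} ORFs/Mb")
```

With these numbers, the bacterial and archaeal genomes come out at roughly 750 to 1050 ORFs per Mb, yeast at about 520, and the nematode and the fly at roughly 100 to 200. Since the ORF counts themselves are tentative, these densities are only indicative.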

Copyright © 2001 Per Kraulis $Date: 2001/11/19 13:46:14 $