Lecture notes: Molecular Bioinformatics 2001, Uppsala University

Lecture 26 Jan 2001 Per Kraulis

The genomes

2. The current genomes

The list of sequenced genomes is growing rapidly, probably at an exponential rate, so it is becoming difficult to stay up to date. Luckily, a number of web sites (try to) keep track of the status of the various genome projects. Here are links to some of them:

Other web sites focus on data for specific genomes. Usually, these sites contain more data than just the DNA sequence of the genome, such as predicted transcripts (ORFs, Open Reading Frames), verified transcripts, and tentative identifications or classifications of the predicted proteins.

Here is a table of a few complete genomes, with information and links. Please note that the number of genes (ORFs) given for each genome is tentative: it depends on the exact procedure used to identify known genes and predict previously unknown genes (see the section Analysing a genome).

Organism | Type | Genome size (Mb) | Number of genes | Links | Comment
Haemophilus influenzae | Bacterial | 1.83 | 1850 | Haemophilus influenzae page at TIGR | The first genome of a free-living organism. 1995
Escherichia coli | Bacterial | 4.64 | 4289 | E. coli Genome Project, University of Wisconsin-Madison | The most studied bacterium. 1997
Rickettsia prowazekii | Bacterial | 1.11 | 834 | | The first genome to be sequenced in Sweden (Siv Andersson, Uppsala). 1998
Methanococcus jannaschii | Archaeal | 1.66 | 1750 | Methanococcus jannaschii page at TIGR | The first sequenced archaeon. 1996
Saccharomyces cerevisiae | Eukaryote | 12.1 | 6294 | SGD, MIPS yeast DB | The first sequenced eukaryote. 1997
Caenorhabditis elegans | Eukaryote, nematode | 97 | 18,424 | WormBase, C. elegans Genome Project | The first sequenced multicellular organism. 1998
Drosophila melanogaster | Eukaryote, insect | 137 | 13,601 | BDGP, FlyBase | Celera Corp, publicly available. 2000
Arabidopsis thaliana | Eukaryote, plant | 125 | 25,498 | The Arabidopsis Information Resource | The first plant. 2000
Homo sapiens | Eukaryote, primate | 3,000 | 50,000 ? | HGP at Sanger, HGP at Oak Ridge | Rough draft exists. Not yet finished, except for chromosomes 21 and 22.
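
The caveat that gene counts depend on the prediction procedure can be illustrated with a small sketch. The Python code below is not the method used by any of the genome projects listed; it is a deliberately naive ORF counter that scans only the three forward reading frames for ATG-to-stop stretches under the standard genetic code. The function name count_orfs, the min_codons parameter and the toy sequence are all invented for illustration. The point is only that the same sequence yields different ORF counts when the minimum-length cutoff changes.

```python
# Stop codons of the standard genetic code.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def count_orfs(seq, min_codons):
    """Count forward-strand ORFs (ATG ... stop) with at least min_codons codons.

    The start codon is counted, the stop codon is not. Hypothetical helper:
    real gene finders also scan the reverse strand and use statistical models.
    """
    seq = seq.upper()
    count = 0
    for frame in range(3):
        start = None  # position of the ATG that opened the current ORF
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    count += 1
                start = None  # close the ORF and look for the next ATG
    return count


# Invented toy sequence: a 5-codon ORF in frame 0, then an 11-codon ORF in frame 1.
toy = "ATG" + "GCT" * 4 + "TAA" + "C" + "ATG" + "GCT" * 10 + "TGA"

for cutoff in (3, 10, 20):
    print(cutoff, count_orfs(toy, cutoff))
```

With a cutoff of 3 codons both toy ORFs are counted, with a cutoff of 10 only the longer one, and with a cutoff of 20 none. Real annotation pipelines differ in many more respects than a length threshold, which is why the gene numbers in the table should be read as estimates.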

