Stockholm Bioinformatics Center, SBC
Lecture notes: Structural biochemistry and bioinformatics 2001

Lecture 23 Nov 2001, Per Kraulis

The genomes

1. The genomes: why?

A genome is the completely (or almost completely) determined DNA sequence of the genetic material (chromosomes as well as any plasmids, mitochondrial DNA, etc) of an organism. The word is somewhat of a misnomer: a genome isn't the same as 'all genes', it is rather 'sequence of all DNA' wherein all genes can be found.

The first genome of a free-living organism (viruses aside) was that of Haemophilus influenzae, published in 1995 (Fleischmann et al, Science (1995) vol 269, pp 496-512).

Why are complete genomes interesting? The most basic answer to that question is that we want to know the complete set of genes that an organism has. The genome of an organism is in a certain sense the blueprint for that organism. Many observations and experiments in biology involve mutants and mutations, and knowing the complete set of genes for an organism can help with the analysis. For example, we may want to be sure that a knocked-out gene does not have a backup copy somewhere in the genome.

There is a deeper answer: knowing the complete genome for an organism is only the first step in the complete mapping of the constituents and processes of the organism. The complete genome is a necessary (but not sufficient) requirement for understanding an organism.

And yet another answer to the question is emerging: the availability of more and more complete genomes allows entirely new kinds of comparisons to be made between organisms. New types of analysis can be applied to old questions in biology, involving problems in evolutionary history and the interactions between species.

From the beginning of the genome era in 1995, there has been, and still is, a strong element of competition in the scientific work. Strong commercial interests also influence the current climate. This was most clearly illustrated by the competition, and sometimes conflict, between the publicly funded Human Genome Project and the company Celera Genomics Corporation.

However, the cost and effort required to sequence a genome, especially a bacterial genome, are rapidly diminishing as new and improved technologies and tools are developed. We will soon find ourselves in a situation where the availability of a genome is considered a basic requirement for working on a specific organism.

Currently (Nov 2001), on the order of 60-80 genomes have been completed, published and made available. In the next few years, perhaps one thousand genomes (mostly bacterial, but also many from higher organisms) are likely to be available. Large parts of the human genome are already available, although the finishing process will probably continue for a few more years.

Copyright © 2001 Per Kraulis $Date: 2001/11/19 13:46:11 $