Finding: By comparing portions of the human genome with those of other mammals, researchers have discovered almost 300 previously unidentified human genes, and found extensions of several hundred genes already known.
Behind the discovery
The fundamental idea is that as organisms evolve, sections of genetic code that do something useful for the organism change in different ways.
What is the human genome?
The sequencing of the complete human genome was accomplished several years ago. That means the order of the 3 billion or so chemical units, called bases, that make up the genetic code is known. What is not known is the exact location of all the short sections that code for proteins or perform regulatory or other functions.
Genes code for proteins, the basic chemical components needed for building cells. More than 20,000 protein-coding genes have been identified. The new finding is important because it shows there could still be many more genes that have been missed by current biological methods. These existing methods are very effective at finding genes that are widely expressed but may miss those that are expressed only in certain tissues or at early stages of embryonic development.
Using evolution for gene discovery
This method uses evolution to identify these genes. Comparing genes across species follows the course of evolution, which has been running this experiment for millions of years. The genes of related species share many similarities, and the differences between them can be identified. The computer serves as the microscope for observing the results.
Four different bases -- commonly referred to by the letters G, C, A and T -- make up DNA. Three bases in a row can code for an amino acid (one of the building blocks of proteins), and a string of these three-letter codes can be a gene, coding for a string of amino acids that a cell can make into a protein.
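To make the three-letter code concrete, here is a minimal Python sketch that reads a DNA string three bases at a time and looks up each triplet in the standard genetic code. The function name `translate` and the example sequence are illustrative assumptions, not something taken from the research described here.

```python
from itertools import product

# Standard genetic code, written in the conventional T, C, A, G order
# for each of the three codon positions. '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence into one-letter amino acid codes.

    Reading starts at the first base and stops at the first stop codon
    or when fewer than three bases remain.
    """
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Made-up example: ATG (methionine), GCC (alanine), TGA (stop).
print(translate("ATGGCCTGA"))  # MA
```

Note that several different triplets map to the same amino acid, which is why some base changes leave the resulting protein unchanged.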
Siepel and colleagues set out to find genes that have been "conserved" -- that are fundamental to all life and that have stayed the same, or nearly so, over millions of years of evolution.
The researchers started with "alignments" discovered by other workers -- stretches up to several thousand bases long that are mostly alike across two or more species.
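What "mostly alike" means can be illustrated with a simple measure: the fraction of positions at which two already-aligned sequences carry the same base. The sketch below is only an illustration under that assumption; the function name `fraction_identical` and the sequences are invented, and real alignments also contain gaps and more than two species, which this toy version does not handle.

```python
def fraction_identical(seq_a, seq_b):
    """Return the fraction of positions where two aligned sequences match.

    Both sequences must have the same length, i.e. they are assumed to be
    aligned already, with no gaps.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return matches / len(seq_a)


# Made-up eight-base stretch from two species, differing at one position.
print(fraction_identical("ACGTACGT", "ACGTTCGT"))  # 0.875
```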
Over millions of years, individual bases can be swapped -- C to G, T to A, for example -- by damage or miscopying. Changes that alter the structure of a protein can kill the organism or send it down a dead-end evolutionary path. But conserved genes contain only minor changes that leave the protein able to do its job. The computer looked for regions with those sorts of changes by creating a mathematical model of how the gene might have changed, then looking for matches to this model.
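The distinction between changes that alter a protein and changes that do not can be sketched in a few lines of Python. This is not the researchers' mathematical model, whose details are not given here; it is a toy illustration that compares two aligned coding sequences codon by codon and counts how many differences leave the amino acid unchanged. The function name `classify_changes` and the sequences are assumptions made up for the example.

```python
from itertools import product

# Standard genetic code in T, C, A, G order; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_changes(seq_a, seq_b):
    """Count differing codons between two aligned, gap-free coding sequences.

    Returns (silent, altering): codons that differ but still code for the
    same amino acid, and codons that differ and change the amino acid.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be equal length and a multiple of 3")
    silent = altering = 0
    for i in range(0, len(seq_a), 3):
        codon_a, codon_b = seq_a[i:i + 3], seq_b[i:i + 3]
        if codon_a == codon_b:
            continue
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            silent += 1
        else:
            altering += 1
    return silent, altering


# Made-up sequences: GCC -> GCT keeps alanine, AAA -> AGA swaps lysine for arginine.
print(classify_changes("ATGGCCCTGAAA", "ATGGCTCTGAGA"))  # (1, 1)
```

A region where silent differences heavily outnumber protein-altering ones is the kind of pattern one would expect from a conserved gene, though a real analysis would rely on a statistical model rather than raw counts.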