Finding: Researchers have produced the first evolutionary history of the duplications in the human genome that are partly responsible for both disease and recent genetic innovations.
This work marks a significant step toward a better understanding of what genomic changes paved the way for modern humans, when these duplications occurred and what the associated costs are -- in terms of susceptibility to disease-causing genetic mutations.
Researchers have answered a vexing genomic question: Which of the thousands of long stretches of repeated DNA in the human genome came first? And which are the duplicates?
Genomes have an ability to copy a long stretch of DNA from one chromosome and insert it into another region of the genome. These copies, known as segmental duplications, hold many evolutionary secrets, and uncovering them is a difficult biological and computational challenge with implications for both medicine and our understanding of evolution.
Researchers have created the first evolutionary history of the duplications in the human genome that are partly responsible for both disease and recent genetic innovations. This marks an important step toward a better understanding of what genomic changes paved the way for modern humans, when these duplications occurred and what the associated costs are -- in terms of susceptibility to disease-causing genetic mutations.
In the past, the highly complex patterns of DNA duplication -- including duplications within duplications -- have prevented the construction of an evolutionary history of these long DNA duplications. To crack the duplication code and determine which of the DNA segments are originals (ancestral duplications) and which are copies (derivative duplications), the researchers looked to both algorithmic biology and comparative genomics.
Identifying the original duplications is a prerequisite to understanding what makes the human genome unstable. Researchers modified an algorithmic genome assembly technique to deconstruct the sequence of repeated stretches of DNA and identify the original sequences. The belief is that there may be something special about the originals, some clue or insight into what causes this colonization of the human genome.
This is the first global view of the evolutionary origin of some of the most complicated regions of the human genome. The researchers tracked down the ancestral origin of more than two-thirds of these long DNA duplications.
First, researchers suggest that specific regions of the human genome experienced elevated rates of duplication activity at different times in our recent genomic history. This contrasts with most models of genomic duplication which suggest a continuous model for recent duplications. Second, a large fraction of the recent duplication architecture centers around a rather small subset of "core duplicons" -- short segments of DNA that come together to form segmental duplications. These cores are focal points of human gene/transcript innovations.
Not all of the duplications in the human genome are created equal. Some of them -- the core duplicons -- appear to be responsible for recent genetic innovations in the human genome. Researchers uncovered 14 such core duplicons.
In four of the 14 cases, there is compelling evidence that genes embedded within the cores are associated with novel human gene innovations. In two cases, the core duplicon has been part of novel fusion genes whose functions appear to be radically different from their antecedents.
Results suggest that the high rate of disease caused by these duplications in the normal population may be offset by the emergence of newly minted human/great-ape specific genes embedded within the duplications. The next challenge will be determining the function of these novel genes.
Mathematical Algorithms and Biological Construction
The researchers applied their expertise in assembling genomes from millions of small fragments -- a problem not unlike the "mosaic decomposition" problem the team faced in analyzing duplications.
Over the years, researchers have applied a 250-year-old algorithmic idea first proposed by the 18th-century mathematician Leonhard Euler (of pi fame) to a variety of problems and demonstrated that it works equally well for a set of seemingly unrelated biological problems, including DNA fragment assembly, reconstructing snake venoms, and now dissecting the mosaic structure of segmental duplications.
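The article does not name Euler's idea. A plausible reading, given the mention of DNA fragment assembly, is his observation about paths that traverse every edge of a graph exactly once, now called Eulerian paths. The sketch below is a toy illustration under that assumption, not the researchers' actual method: it breaks short reads into overlapping k-letter words, links them in a graph, and walks an Eulerian path to spell out a sequence. The function names, the reads and the value of k are all hypothetical.

```python
from collections import defaultdict


def de_bruijn_edges(reads, k):
    """Return one (prefix, suffix) edge for each distinct k-mer in the reads."""
    kmers = {read[i:i + k] for read in reads for i in range(len(read) - k + 1)}
    return [(kmer[:-1], kmer[1:]) for kmer in sorted(kmers)]


def eulerian_path(edges):
    """Walk every edge exactly once (Hierholzer's algorithm).

    Assumes the edges admit an Eulerian path; no validation is done.
    """
    graph = defaultdict(list)
    balance = defaultdict(int)  # out-degree minus in-degree per node
    for src, dst in edges:
        graph[src].append(dst)
        balance[src] += 1
        balance[dst] -= 1
    # Start at the node with one surplus outgoing edge, if there is one.
    start = next((n for n, b in balance.items() if b == 1), edges[0][0])
    stack, path = [start], []
    while stack:
        node = stack[-1]
        if graph[node]:
            stack.append(graph[node].pop())  # follow an unused edge
        else:
            path.append(stack.pop())  # dead end: node is finished
    return path[::-1]


def assemble(reads, k):
    """Spell the sequence along the Eulerian path of the k-mer graph."""
    path = eulerian_path(de_bruijn_edges(reads, k))
    return path[0] + "".join(node[-1] for node in path[1:])


# Hypothetical overlapping reads drawn from one short sequence.
reads = ["ATGGCGT", "GGCGTGC", "CGTGCA"]
print(assemble(reads, 4))
```

In this toy input every word occurs once, so only one path exists. When a stretch of sequence is repeated, several paths can be valid, which hints at why duplicated regions are hard to untangle.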