The genome club
This is the Year of the Rat. It’s also the Year
of the Dog and the Opossum. Next year will be the Year
of the Cow, the Rhesus Macaque, and maybe a few other
assorted creatures with hot blood and fur.
They’re all joining humans, mice, and chimpanzees
in the exclusive club of mammals whose whole genome
has been sequenced—giving complete and matching
sets of each animal’s DNA, and offering researchers
the opportunity to rebuild biology and medicine from
the ground up.
The technology yielding this treasure works by deciphering
an organism’s genetic code, which is held in its
chromosomes in DNA “base pairs” that combine
adenine and thymine or cytosine and guanine (commonly
referred to with the letters A, T, C, and G). Mammals
have about three billion base pairs. Scientists tackle
the intimidating task of reading them all in exactly
the right order with an entirely counter-intuitive approach.
“If you think of the genome as an encyclopedia,
you break up the whole encyclopedia and get lots of
strings of letters. You then try to put the strings
together like a puzzle,” notes Kerstin Lindblad-Toh,
codirector of the Genome Sequencing and Analysis Program
at the Broad Institute in Cambridge, Massachusetts.
Scientists separate DNA fragments, read them, then
reassemble them using supercomputers and some very,
very clever programming. The researchers then repeat
the process over and over to make sure they’ve
created the most accurate genetic map possible. While
this is a Herculean task, the efficiency and volume
of current sequencing techniques “have surpassed
everybody’s wildest dreams,” comments George
Weinstock, codirector of the Human Genome Sequencing
Center at Baylor College of Medicine in Houston.
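The break-it-up-and-reassemble idea described above can be illustrated with a toy sketch. This is not the software the sequencing centers use; it is a minimal greedy overlap merger, and the function names (`overlap`, `greedy_assemble`) and the sample fragments are hypothetical. Real assemblers must also cope with read errors, repeated regions, and reads from both DNA strands, none of which is handled here.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`.

    Returns 0 if no such overlap is at least `min_len` letters long.
    """
    max_len = min(len(a), len(b))
    for n in range(max_len, min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(fragments: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the two fragments that share the largest overlap.

    Returns the remaining pieces (contigs); one piece means everything joined.
    """
    # Drop exact duplicates and fragments wholly contained in another fragment.
    frags = list(dict.fromkeys(fragments))
    frags = [f for f in frags if not any(f != g and f in g for g in frags)]

    while len(frags) > 1:
        best_n, best_i, best_j = 0, -1, -1
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # nothing overlaps any more; leave the pieces separate
        merged = frags[best_i] + frags[best_j][best_n:]
        frags = [f for k, f in enumerate(frags) if k not in (best_i, best_j)]
        frags.append(merged)
    return frags


# Three overlapping "strings of letters" cut from one made-up 15-letter sequence.
pieces = ["TACGTTAGC", "ATGGCGTA", "GCGTACGT"]
print(greedy_assemble(pieces))
```

With only three clean fragments the puzzle has one solution. At the scale of three billion base pairs, the same pairwise comparison becomes the computing problem that calls for supercomputers and clever programming.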
The resulting mammalian DNA “parts lists”
give an effect greater than the sum of their parts,
says Jane Peterson, associate director of the National
Human Genome Research Institute’s Division of
Extramural Research.
These projects are opening the floodgates not just
for understanding individual organisms, but for comparative
genomics studies, revealing what genetic material is
conserved over time, and how these genomes relate to
each other—and to humans.
“It’s important to understand how this
effort relates to human health,” Peterson emphasizes.
“The sequenced assembly of genes at different
evolutionary distances starts to give clues as to what’s
important in the human genome and even why it’s
important. This really is the major tool for improving
understanding of the human genome.”
Mammals on the march
While organisms from yeast on up are being sequenced
in large numbers, scientists are particularly interested
in mapping mammalian genomes—especially those
mammals employed as biological research models.
Given its lead role in medical studies, “the
mouse was a no-brainer” as the national institute
picked targets, Peterson remarks. A high-quality draft
sequence was published in 2002. Since then, scientists
have re-sequenced the mouse over and over and plan to
release a final, highly polished map in late 2005.
The rat, extensively studied in behavioral research
and other investigations, was another early choice.
A first draft was published this spring. Last December,
scientists published the first draft of the chimp genome
sequence.
Another critical model organism, the dog, has been
under study in two different labs. Man’s best
friend shares many diseases with humans and even lives
in the same environments. Dogs come in wildly diverse
breeds, often have well-documented pedigrees and veterinary
records, and show different susceptibilities to specific
genetic diseases. Last year, The Institute for Genomic
Research in Rockville, Maryland, sequenced the genome
of a poodle; this July, Broad Institute and partners
in the National Human Genome Research Institute completed
a much more detailed draft of a boxer.
And it won’t end there. Scientists are hard at
work to uncover the genomes of other mammals, including
the Rhesus macaque monkey (the major primate for biomedical
research) and the cow (key not just in agriculture but
in studies of everything from cardiovascular disease
to reproduction). Also in the works are plans to sequence
the gray short-tailed South American opossum, tammar
wallaby, African elephant, European common shrew, European
hedgehog, guinea pig, lesser hedgehog tenrec, nine-banded
armadillo, rabbit, cat, and orangutan. “One of
my headaches recently has been finding these organisms,”
Lindblad-Toh says wryly.
Reading a map
It’s still early days for analyzing the initial
wave of mammalian genomes. The first paper on the chimpanzee
whole genome, for instance, isn’t expected until
late this year or early 2005. What’s more,
the mammals don’t come two by two. Analyzing male
Y chromosomes can be exasperatingly difficult due to
the chromosome’s intricate makeup, so most sequencing
projects examine DNA samples from female animals. To
fill in some missing Ys, Whitehead collaborates with Washington
University in Saint Louis. For example, the chimp Y
should give insight into human male fertility, says
Jennifer Hughes, a postdoctoral researcher in the lab
of Whitehead Member David Page.
Even within this exclusive club of mammals, not all
animals are equal. They get quite different levels of
effort and expense, particularly in the number of times
their complete DNA is scanned and in the effort made
to fill in the trickiest gaps in the genetic code. “It
takes much more time to finish a genome than to get
the first 95 percent,” as Lindblad-Toh puts it.
At one extreme, the mouse will get about the same scrutiny
as the human, with each DNA base read at least seven
times. At the other, wallaby DNA will be scanned only
twice. Scientists at the National Human Genome Research
Institute may use this abbreviated approach to sequence
most of the other mammals on their to-do list. Some
researchers complain the approach will give insufficient
data, muddying the waters for interpretation. The pros
and cons still are being kicked around, Peterson responds.
As they start to decode the draft sequences, she adds,
scientists will get a better grip on what will be most
efficient.
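A rough calculation shows why a two-pass scan worries some researchers. The sketch below assumes an idealized model in which reads land uniformly at random along the genome, so the chance that a given base is never read at average coverage c is about e^(-c). That model, the function names, and the treatment of "seven times" and "twice" as average coverage are assumptions made for illustration, not the institute's own analysis.

```python
import math

GENOME_SIZE = 3_000_000_000  # "about three billion base pairs" for a mammal


def fraction_covered(coverage: float) -> float:
    """Expected fraction of bases read at least once under the random-read model."""
    return 1.0 - math.exp(-coverage)


def expected_missed_bases(coverage: float, genome_size: int = GENOME_SIZE) -> float:
    """Expected number of bases never read at the given average coverage."""
    return genome_size * math.exp(-coverage)


for c in (2, 7):
    print(
        f"{c}x coverage: {fraction_covered(c):.2%} of bases read, "
        f"about {expected_missed_bases(c):,.0f} bases missed"
    )
```

Under these assumptions, two passes leave roughly one base in seven unread, while seven passes miss fewer than one in a thousand. Real reads are not perfectly random, so actual gaps will differ.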
Doggone important
It’s no surprise to learn that investigators
hail these new cornucopias of data. This year’s
dog draft is an “enormous step,” says Gustavo
Aguirre, professor of medical genetics and ophthalmology
at the University of Pennsylvania in Philadelphia. “It
allows us to now do work that we could only have hoped
to do several years ago.”
Previously, when Aguirre’s lab isolated a protein
of interest, the next step was to clone the gene in
the lab. Poring through a huge library of physical complementary
DNA “might take as long as six months to a year,”
Aguirre says. Now, the researchers usually can find
the gene target by crunching through the dog sequence
database.
“We’re eternally grateful for it,”
says Aguirre, “but we’re not satisfied.”
Fortunately, the advances in sequencing technologies
are keeping pace with scientists’ demands. “Now
a single center can finish a mammalian genome in a year,”
notes Baylor’s Weinstock. “Almost everything
of significant value will be sequenced at some level
in the next few years.”