Published online: 22 November 2006; | doi:10.1038/news061120-9

Human genome more variable than previously thought

Surprisingly large segments of DNA found to differ from person to person.

Helen Pearson

How alike are you and me? About 99.5%

Nearly six years after the sequence of the human genome was sketched out, one might assume that researchers had worked out what all that DNA means. But a new investigation has left them wondering just how similar one person's genome is to another's.

Geneticists have generally assumed that your string of DNA 'letters' is 99.9% identical to your neighbour's, with differences only in the odd individual letter. These differences make each person genetically unique, influencing everything from appearance and personality to susceptibility to disease.

But hold on, say the authors of a new study published in Nature [1]. They have identified surprisingly large chunks of the genome that can differ dramatically from one person to the next. "Everyone has a unique pattern," says one of the lead authors, Matthew Hurles at the Wellcome Trust Sanger Institute in Cambridge, UK.

The differences in question, made up of stretches of DNA that span tens to hundreds of thousands of chemical letters, are called 'copy-number variants', or CNVs. Within a given stretch of DNA, one person may carry one copy of a DNA segment, while another may have two, three or more. The region might be completely absent from a third person's genome. And sometimes the segments are shuffled up in different ways.


Same but different

3,080 million 'letters' of DNA in the human genome

22,205 genes, by one recent estimate

10 million single-letter changes (SNPs): that's only 0.3% of the genome

1,447 copy-number variants, covering a surprisingly large 12% of the genome

About 99.5% similarity between two random people's DNA
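The sidebar percentages can be checked with a little arithmetic. The sketch below uses only the figures quoted above. The variable names are illustrative, and the average CNV span is a rough figure that assumes the variable regions do not overlap; it is not a number reported by the study.

```python
# Figures quoted in the sidebar
GENOME_LETTERS = 3_080_000_000   # 3,080 million letters of DNA
SNP_COUNT = 10_000_000           # single-letter changes (SNPs)
CNV_COUNT = 1_447                # copy-number variant regions
CNV_GENOME_FRACTION = 0.12       # share of the genome covered by CNVs


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Share of the genome accounted for by single-letter changes
snp_percent = percent(SNP_COUNT, GENOME_LETTERS)

# Number of letters falling inside CNV regions
cnv_letters = CNV_GENOME_FRACTION * GENOME_LETTERS

# Rough average span per region (assumption: regions do not overlap)
mean_cnv_span = cnv_letters / CNV_COUNT

print(f"SNPs: {snp_percent:.2f}% of the genome")
print(f"CNV regions: about {cnv_letters / 1e6:.0f} million letters")
print(f"Average CNV region: about {mean_cnv_span:,.0f} letters")
```

The SNP share comes out at roughly 0.3%, matching the sidebar, and the average region works out to a few hundred thousand letters, in line with the "tens to hundreds of thousands" described for CNVs.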

The team found nearly 1,500 such regions, taking up some 12% of the human genome. That doesn't mean that your DNA is 12% different from mine (or 88% similar), because any two people's DNA will differ at only a handful of these spots.

According to the team's back-of-the-envelope calculations, one person's DNA is probably 99.5% similar to their neighbour's. Or a bit less. "I've tried to do the calculation and it's very complicated," says Hurles. "It all depends on how you do the accounting."

The answer is also unclear because researchers think that there are many more variable blocks of sequence that are 10,000 or 1,000 letters long and were excluded from the current study. Because of the limitations of their methods, the new map mainly identified variable chunks longer than 50,000 letters.

Many of these CNVs are thought to be important in our biology. The team found that 10% of human genes are spanned by these regions, meaning that they might be doubled, deleted or otherwise jumbled in a way that could help to determine whether and when we develop diseases.

CNVs have already been linked with susceptibility to Alzheimer's disease, kidney disease and HIV, among others, and the new map will help researchers to make connections to other conditions. "There's a general expectation that these things are quite influential," Wigler says.


Scherer and his team have already lined up the only two complete human genome sequences produced by the publicly funded Human Genome Project and the private company Celera. They identified both single-letter changes and small and large regions of variation and report their results in Nature Genetics [2].

  1. Redon R., et al. Nature, 444, 444-454 (2006).
  2. Khaja R., et al. Nature Genetics, advance online publication doi:10.1038/ng1921 (2006).