A novel method for profiling short tandem repeats (STRs) in personal genomes has been developed in America, bringing STR identification into the 21st century.
The three-stage technique – named lobSTR – accurately and simultaneously profiles more than 100,000 STRs from a human genome in one day. It makes use of the fact that each person’s set of STRs – collections of repeated two- to six-nucleotide-long sequences – is different from everyone else’s.
The technique harnesses concepts from signal processing and statistical learning to avoid gapped alignment and to address specific noise in STR calling. To create a DNA fingerprint, lobSTR scans the entire genome to identify all STRs and what nucleotide pattern is repeated within those stretches of DNA. Next it notes the non-repeating sequences flanking either end of the STRs before removing any noise, to produce an accurate description of the STRs’ configuration.
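The first two steps described above, finding a repeat and recording what flanks it, can be illustrated with a minimal Python sketch. This is not lobSTR's code: it swaps in a plain regular expression for the signal-processing step, skips noise removal entirely, and the function name `find_strs`, the `flank` parameter and the four-copy minimum are all assumptions made for illustration.

```python
import re

# Illustrative stand-in for STR detection (NOT lobSTR's actual method):
# a motif of 2-6 nucleotides followed by at least three more copies of itself.
# The four-copy minimum is an arbitrary assumption for this sketch.
STR_PATTERN = re.compile(r"([ACGT]{2,6}?)\1{3,}")


def find_strs(seq, flank=5):
    """Return each STR found in seq with its motif, copy number and flanks."""
    hits = []
    for m in STR_PATTERN.finditer(seq):
        motif = m.group(1)
        # Skip single-base runs such as "AAAAAAAA", which match as motif "AA".
        if len(set(motif)) == 1:
            continue
        hits.append({
            "motif": motif,
            "copies": len(m.group(0)) // len(motif),
            "start": m.start(),
            "end": m.end(),
            # Non-repeating sequence on either side of the repeat.
            "left_flank": seq[max(0, m.start() - flank):m.start()],
            "right_flank": seq[m.end():m.end() + flank],
        })
    return hits


if __name__ == "__main__":
    read = "TTGACC" + "GATA" * 6 + "CCTGAA"
    for hit in find_strs(read):
        print(hit)
```

On the example read, the sketch reports a single STR with motif GATA repeated six times, flanked by TGACC on the left and CCTGA on the right. A real profiler would also have to cope with sequencing errors inside the repeat, which an exact-match pattern like this one cannot do.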
The speed and reliability of lobSTR exceeded the performance of mainstream algorithms for STR profiling, and its ability to accurately and efficiently describe thousands of STRs in one genome has opened up many new research opportunities.
“lobSTR found that in one human genome, 55% of the STRs are polymorphic; they showed some difference, which is very surprising,” said Yaniv Erlich, a fellow at the Whitehead Institute for Biomedical Research.
“Usually DNA’s polymorphism rate is very low because most DNA is identical between two people. With this tool, we provide access to tens of thousands of quickly changing markers that you couldn’t get before, and those can be used in medical genetics, population genetics and forensics.”
The next step is to characterise the amount of STR variation in individuals and populations, said Melissa Gymrek, first author of the paper published in Genome Research. This could provide information about the normal range of STR alleles at each locus, which would be useful in medical genetics studies aimed at determining whether a given allele is normal or pathogenic. Research could also involve comparing STRs in case-control studies to find those associated with disease.