A learner's notebook: Short read sequencing or long read sequencing?

Genome sequencing is a hot topic these days. Currently, the most popular approach is to generate millions of short reads, typically 50 to 150 nucleotides long, and then assemble them computationally. Illumina, which holds a near monopoly in the sequencing business, follows this strategy. However, this strategy has some drawbacks. For example, it reads the genome from many cells at once, and the biological signals from those cells are averaged to produce a consensus sequence. Consequently, it cannot identify molecule-level biological differences. Moreover, this strategy does not work well with repetitive or heterozygous sequences.

In contrast, sequencing can also be done with long reads. These reads can be 100 times longer than short reads, so each long read carries fundamentally more information than a short one. Long reads can be mapped uniquely in complex regions, including repetitive elements. However, long reads currently suffer from an elevated error rate of about 15%, which means that roughly one in every 7 bases is incorrect. Because of this limitation, long reads alone are not yet suitable for sequencing. However, a combination of short and long reads can perform substantially better than either method alone.
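To see why read length matters in repetitive regions, here is a minimal Python sketch. The genome, the repeat, and the helper `find_all` are all made up for illustration, and exact string matching stands in for a real read mapper, which would also have to tolerate sequencing errors.

```python
def find_all(genome: str, read: str) -> list[int]:
    """Return every 0-based start position where read matches genome exactly."""
    positions = []
    start = genome.find(read)
    while start != -1:
        positions.append(start)
        start = genome.find(read, start + 1)
    return positions


# Toy genome (hypothetical): two copies of one repeat separated by unique sequence.
repeat = "ACGTACGTAC"
genome = "TTGACCA" + repeat + "GGCTTAG" + repeat + "CATGGAT"

short_read = repeat[2:8]   # lies entirely inside the repeat
long_read = genome[3:20]   # spans the first repeat plus unique sequence on both sides

short_hits = find_all(genome, short_read)
long_hits = find_all(genome, long_read)

print(short_hits)  # [9, 26] -> two candidate positions, so the mapping is ambiguous
print(long_hits)   # [3]     -> a single position, so the mapping is unique
```

The short read fits in both copies of the repeat, so there is no way to tell which copy it came from. The long read reaches into the unique sequence around the repeat, and that is what pins it to one place.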

Pacific Biosciences, a biotechnology company, focuses on long reads. They are trying to improve their error correction algorithm so that sequencing can be performed from long reads alone, without short reads. That would be a great achievement, as it would reduce the cost and also enable the identification of heterozygous and repetitive elements. If they succeed, we may expect Illumina's monopoly to weaken.
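The basic idea behind error correction can be illustrated with a simple majority vote. This is only a sketch and not the algorithm Pacific Biosciences uses: it assumes several reads cover the same region, that they are already aligned and of equal length, and that all errors are substitutions. Real long reads also contain insertions and deletions, so they need to be aligned first. The function name `majority_consensus` and the example reads are hypothetical.

```python
from collections import Counter


def majority_consensus(reads: list[str]) -> str:
    """Pick the most common base at each position across pre-aligned reads."""
    if not reads:
        return ""
    length = len(reads[0])
    if any(len(r) != length for r in reads):
        raise ValueError("all reads must have the same length")
    # zip(*reads) yields one column of bases per position.
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*reads))


truth = "ACGTTGCAAC"
noisy_reads = [
    "ACGTTGCAAC",  # no error
    "ACCTTGCAAC",  # substitution at position 2
    "ACGTTGGAAC",  # substitution at position 6
    "TCGTTGCAAC",  # substitution at position 0
    "ACGTTGCATC",  # substitution at position 8
]

consensus = majority_consensus(noisy_reads)
print(consensus == truth)  # True: each error is outvoted by the other four reads
```

As long as the errors fall at different positions in different reads, enough coverage lets the correct base win at every position, even when each individual read is noisy.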

Another biotech company, Oxford Nanopore, is also in the race. They use a different technology: they measure the characteristic change in conductance as single-stranded DNA passes through or near a nanopore, a small hole on the order of 1 nanometer in internal diameter. This strategy also produces long reads from a single cell. Although this approach suffers from a high error rate, one experiment showed that more than 80% of the reads contained perfect 50-nucleotide sections. This is impressive. If a proper error correction algorithm can be devised, Oxford Nanopore could overtake Pacific Biosciences in the long-read field.
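A statistic like "reads with a perfect 50-nucleotide section" could be computed roughly as follows. This is a sketch under strong assumptions, not the method used in the experiment: each read is taken to be already aligned base for base against its true sequence, with substitution errors only. The function names, the `mutate` helper, and the toy reads are hypothetical.

```python
def longest_perfect_run(read: str, reference: str) -> int:
    """Length of the longest stretch where read and reference agree base for base."""
    best = current = 0
    for a, b in zip(read, reference):
        current = current + 1 if a == b else 0
        best = max(best, current)
    return best


def fraction_with_perfect_section(pairs, min_len=50):
    """Fraction of (read, reference) pairs containing a perfect stretch of min_len bases."""
    if not pairs:
        return 0.0
    hits = sum(1 for read, ref in pairs if longest_perfect_run(read, ref) >= min_len)
    return hits / len(pairs)


def mutate(seq: str, positions) -> str:
    """Introduce a substitution at each given position (toy error model)."""
    swap = {"A": "C", "C": "G", "G": "T", "T": "A"}
    bases = list(seq)
    for i in positions:
        bases[i] = swap[bases[i]]
    return "".join(bases)


reference = "ACGT" * 20                                # 80 bases
read_one_error = mutate(reference, [10])               # one error near the start
read_many_errors = mutate(reference, range(0, 80, 20)) # an error every 20 bases

print(longest_perfect_run(read_one_error, reference))    # 69
print(longest_perfect_run(read_many_errors, reference))  # 19
print(fraction_with_perfect_section(
    [(read_one_error, reference), (read_many_errors, reference)]
))  # 0.5
```

The point is that a read can have a high overall error rate and still contain long error-free stretches, depending on how the errors are distributed along the read.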

To make the scenario more interesting, GenapSys, another biotech company, aims to develop a small all-electronic instrument, like an iPad, that will perform all the sequencing steps and thus reduce sequencing time and cost.

Let's see which technology (or company) comes out on top.

References:
1) Greenleaf,W.J. and Sidow,A. (2014) The future of sequencing: convergence of intelligent design and market Darwinism. Genome Biol., 15, 303.
2) http://www.genengnews.com/insight-and-intelligenceand153/the-long-and-the-short-of-dna-sequencing/77899725/
3) http://www.fluidigm.com/december-31-2013.html
4) http://allseq.com/knowledgebank/emerging-technologies/genapsys

April 2, 2014
