Perspectives and Challenges of Emerging Single-Molecule DNA Sequencing Technologies

Mingsheng Xu; Daisuke Fujita; Nobutaka Hanagata

doi:10.1002/smll.200900976

Single-molecule DNA sequencing technologies for future genomics research

Trends in Biotechnology ◽ 10.1016/j.tibtech.2008.07.003 ◽ 2008 ◽ Vol 26 (11) ◽ pp. 602-611 ◽ Cited By ~ 150

Author(s): Pushpendra K. Gupta

Keyword(s): DNA Sequencing ◽ Single Molecule ◽ Sequencing Technologies ◽ Genomics Research

Single-Molecule Analysis Methods Using Nanogap Electrodes and Their Application to DNA Sequencing Technologies

Bulletin of the Chemical Society of Japan ◽ 10.1246/bcsj.20170224 ◽ 2017 ◽ Vol 90 (11) ◽ pp. 1189-1210 ◽ Cited By ~ 12

Author(s): Masateru Taniguchi

Keyword(s): DNA Sequencing ◽ Single Molecule ◽ Sequencing Technologies ◽ Analysis Methods ◽ Nanogap Electrodes ◽ Single Molecule Analysis

Single-molecule DNA sequencing of widely varying GC-content using nucleotide release, capture and detection in microdroplets

Nucleic Acids Research ◽ 10.1093/nar/gkaa987 ◽ 2020 ◽ Vol 48 (22) ◽ pp. e132-e132

Author(s): Tim J Puchtler ◽ Kerr Johnson ◽ Rebecca N Palmer ◽ Emma L Talbot ◽ Lindsey A Ibbotson ◽ ...

Keyword(s): DNA Sequencing ◽ Single Molecule ◽ Direct Detection ◽ GC Content ◽ Cost Effective ◽ Epigenetic Modifications ◽ Fluorescence Signal ◽ Sequencing Platform ◽ Sequencing Technologies ◽ Lower Accuracy

Abstract Despite remarkable progress in DNA sequencing technologies there remains a trade-off between short-read platforms, having limited ability to sequence homopolymers, repeated motifs or long-range structural variation, and long-read platforms, which tend to have lower accuracy and/or throughput. Moreover, current methods do not allow direct readout of epigenetic modifications from a single read. With the aim of addressing these limitations, we have developed an optical electrowetting sequencing platform that uses step-wise nucleotide triphosphate (dNTP) release, capture and detection in microdroplets from single DNA molecules. Each microdroplet serves as a reaction vessel that identifies an individual dNTP based on a robust fluorescence signal, with the detection chemistry extended to enable detection of 5-methylcytosine. Our platform uses small reagent volumes and inexpensive equipment, paving the way to cost-effective single-molecule DNA sequencing, capable of handling widely varying GC-bias, and demonstrating direct detection of epigenetic modifications.

Implementation and Data Analysis of Tn-seq, Whole-Genome Resequencing, and Single-Molecule Real-Time Sequencing for Bacterial Genetics

Journal of Bacteriology ◽ 10.1128/jb.00560-16 ◽ 2016 ◽ Vol 199 (1) ◽ Cited By ~ 10

Author(s): Peter E. Burby ◽ Taylor M. Nye ◽ Jeremy W. Schroeder ◽ Lyle A. Simmons

Keyword(s): DNA Sequencing ◽ Real Time ◽ Single Molecule ◽ De Novo ◽ Large Data ◽ Data Sets ◽ Bacterial Genetics ◽ Sequencing Technologies ◽ Bioinformatics Tools ◽ Whole Genome Resequencing

ABSTRACT Few discoveries have been more transformative to the biological sciences than the development of DNA sequencing technologies. The rapid advancement of sequencing and bioinformatics tools has revolutionized bacterial genetics, deepening our understanding of model and clinically relevant organisms. Although application of newer sequencing technologies to studies in bacterial genetics is increasing, the implementation of DNA sequencing technologies and development of the bioinformatics tools required for analyzing the large data sets generated remain a challenge for many. In this minireview, we have chosen to summarize three sequencing approaches that are particularly useful for bacterial genetics. We provide resources for scientists new to and interested in their application. Here, we discuss the analysis of data from transposon mutagenesis followed by deep sequencing (Tn-seq) to determine gene disruptions differentially represented in a mutant population and Illumina sequencing for identification of suppressor or other mutations, and we summarize single-molecule real-time (SMRT) sequencing for de novo genome assembly and the use of the output data for detection of DNA base modifications.

Direct Read and Quantify Damage Nucleotide from an Oligonucleotide Using a Single Molecule Interface

10.26434/chemrxiv.10080638.v2 ◽ 2019

Author(s): Jiajun Wang ◽ Meng-Yin Li ◽ Jie Yang ◽ Ya-Qian Wang ◽ Xue-Yuan Wu ◽ ...

Keyword(s): DNA Sequencing ◽ Single Molecule ◽ Genetic Diseases ◽ High Sensitivity ◽ DNA Lesions ◽ Lesion Detection ◽ The Novel ◽ Structural Variations ◽ DNA Lesion ◽ Base Lesion

DNA lesions such as methylcytosine (mC), 8-oxo-guanine (OG), and inosine (I) can cause genetic diseases. Identifying the variety of lesion bases is usually beyond the capability of conventional DNA sequencing, which is mainly designed to discriminate only the four canonical bases. Lesion detection therefore remains a challenge, owing to the large variety of lesions and the poorly distinguishable readouts produced by minor structural variations. Moreover, standard amplification and labelling hardly work for DNA lesion detection. Herein, we designed a single-molecule interface from the mutant K238Q aerolysin, whose confined sensing region is highly suited to capturing each base lesion and directly converting it into a distinguishable current readout. Compared with previous single-molecule sensing interfaces, the resolution of the K238Q aerolysin nanopore is enhanced by two orders of magnitude. The novel K238Q could directly discriminate at least three types of lesion (mC, OG, I) without labelling and quantify modification sites in oligonucleotides of mixed hetero-composition. Such a nanopore could be further applied to diagnose genetic diseases with high sensitivity.

Faculty Opinions recommendation of Continuous base identification for single-molecule nanopore DNA sequencing.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽ 10.3410/f.1157396.617550 ◽ 2009

Author(s): Stephan Beck

Keyword(s): DNA Sequencing ◽ Single Molecule ◽ Nanopore DNA Sequencing

Reconstruction of Microbial Haplotypes by Integration of Statistical and Physical Linkage in Scaffolding

Molecular Biology and Evolution ◽ 10.1093/molbev/msab037 ◽ 2021 ◽ Cited By ~ 1

Author(s): Chen Cao ◽ Jingni He ◽ Lauren Mak ◽ Deshan Perera ◽ Devin Kwok ◽ ...

Keyword(s): Single Molecule ◽ Human Genetics ◽ Real Data ◽ Sequencing Technologies ◽ Bacterial Genomics ◽ Physical Linkage ◽ Pooled Sequencing ◽ Computational Reconstruction ◽ Host Genetic ◽ Host Evolution

Abstract DNA sequencing technologies provide unprecedented opportunities to analyze within-host evolution of microorganism populations. Often, within-host populations are analyzed via pooled sequencing of the population, which contains multiple individuals or “haplotypes.” However, current next-generation sequencing instruments, in conjunction with single-molecule barcoded linked-reads, cannot distinguish long haplotypes directly. Computational reconstruction of haplotypes from pooled sequencing has been attempted in virology, bacterial genomics, metagenomics, and human genetics, using algorithms based on either cross-host genetic sharing or within-host genomic reads. Here, we describe PoolHapX, a flexible computational approach that integrates information from both genetic sharing and genomic sequencing. We demonstrated that PoolHapX outperforms state-of-the-art tools tailored to specific organismal systems, and is robust to within-host evolution. Importantly, together with barcoded linked-reads, PoolHapX can infer whole-chromosome-scale haplotypes from 50 pools each containing 12 different haplotypes. By analyzing real data, we uncovered dynamic variations in the evolutionary processes of within-patient HIV populations previously unobserved in single position-based analysis.

Survey of the Bradysia odoriphaga Transcriptome Using PacBio Single-Molecule Long-Read Sequencing

Genes ◽ 10.3390/genes10060481 ◽ 2019 ◽ Vol 10 (6) ◽ pp. 481 ◽ Cited By ~ 1

Author(s): Chen ◽ Lin ◽ Xie ◽ Zhong ◽ Zhang ◽ ...

Keyword(s): Insecticide Resistance ◽ Single Molecule ◽ Functional Categories ◽ Genetic Studies ◽ Sequencing Technologies ◽ Clusters Of Orthologous Groups ◽ Long Read ◽ Main Gene ◽ First Time ◽ Main Factor

The damage caused by Bradysia odoriphaga is the main factor threatening the production of vegetables in the Liliaceae family. However, few genetic studies of B. odoriphaga have been conducted because of a lack of genomic resources. Many long-read sequencing technologies have been developed in the last decade; therefore, in this study, the transcriptome covering all developmental stages of B. odoriphaga was sequenced for the first time by PacBio single-molecule long-read sequencing. Here, 39,129 isoforms were generated, and 35,645 were annotated when checked against sequences available in different databases. Overall, 18,473 isoforms were distributed across 25 Clusters of Orthologous Groups categories, and 11,880 isoforms were categorized into 60 functional groups that belonged to the three main Gene Ontology classifications. Moreover, 30,610 isoforms were assigned to 44 functional categories belonging to six main Kyoto Encyclopedia of Genes and Genomes functional categories. Coding DNA sequence (CDS) prediction showed that 36,419 out of 39,129 isoforms were predicted to have a CDS, and 4,319 simple sequence repeats were detected in total. Finally, 266 insecticide resistance and metabolism-related isoforms were identified as candidate genes for further investigation of insecticide resistance and metabolism in B. odoriphaga.
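
The isoform counts quoted in this abstract are easier to compare when expressed as fractions of the 39,129 isoforms generated. The following is a minimal Python sketch of that arithmetic only; the dictionary labels and variable names are illustrative assumptions and do not come from the study.

```python
# Counts quoted in the abstract; the labels are illustrative, not the authors'.
TOTAL_ISOFORMS = 39_129

counts = {
    "annotated": 35_645,      # isoforms with any database annotation
    "COG": 18_473,            # Clusters of Orthologous Groups
    "GO": 11_880,             # Gene Ontology
    "KEGG": 30_610,           # Kyoto Encyclopedia of Genes and Genomes
    "predicted CDS": 36_419,  # isoforms with a predicted coding sequence
}

# Share of all isoforms falling in each category, as a percentage.
percentages = {name: 100 * n / TOTAL_ISOFORMS for name, n in counts.items()}

for name, pct in percentages.items():
    print(f"{name:>14}: {pct:.1f}% of isoforms")
```

Read this way, roughly nine in ten isoforms carry an annotation or a predicted CDS, while the Gene Ontology assignment covers a much smaller share.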

Hapo-G, haplotype-aware polishing of genome assemblies with accurate reads

NAR Genomics and Bioinformatics ◽ 10.1093/nargab/lqab034 ◽ 2021 ◽ Vol 3 (2)

Author(s): Jean-Marc Aury ◽ Benjamin Istace

Keyword(s): Single Molecule ◽ Direct Consequence ◽ High Quality ◽ Sequencing Errors ◽ Coding Regions ◽ Sequencing Technologies ◽ Long Reads ◽ Oxford Nanopore ◽ Long Read ◽ Genome Assemblies

Abstract Single-molecule sequencing technologies have recently been commercialized by Pacific Biosciences and Oxford Nanopore with the promise of sequencing long DNA fragments (kilobases to megabases order) and then, using efficient algorithms, provide high quality assemblies in terms of contiguity and completeness of repetitive regions. However, the error rate of long-read technologies is higher than that of short-read technologies. This has a direct consequence on the base quality of genome assemblies, particularly in coding regions where sequencing errors can disrupt the coding frame of genes. In the case of diploid genomes, the consensus of a given gene can be a mixture between the two haplotypes and can lead to premature stop codons. Several methods have been developed to polish genome assemblies using short reads and generally, they inspect the nucleotide one by one, and provide a correction for each nucleotide of the input assembly. As a result, these algorithms are not able to properly process diploid genomes and they typically switch from one haplotype to another. Herein we proposed Hapo-G (Haplotype-Aware Polishing Of Genomes), a new algorithm capable of incorporating phasing information from high-quality reads (short or long-reads) to polish genome assemblies and in particular assemblies of diploid and heterozygous genomes.
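
The haplotype-mixing problem described in this abstract can be illustrated with a toy example. The sketch below is not Hapo-G's algorithm; it is a hypothetical per-position majority vote over invented reads, meant only to show how correcting one nucleotide at a time can yield a consensus that matches neither haplotype. The sequences, read positions, and function name are all assumptions made for illustration.

```python
from collections import Counter

# Two hypothetical haplotypes that differ at positions 1 and 4.
HAP_A = "ACGTAC"
HAP_B = "ATGTGC"

# Invented reads as (start position, sequence) pairs.
reads = [
    (0, "ACG"), (0, "ACG"), (0, "ATG"),  # position 1: haplotype A is the majority
    (3, "TGC"), (3, "TGC"), (3, "TAC"),  # position 4: haplotype B is the majority
]

def per_position_consensus(reads, length):
    """Pick the most common base in each column, ignoring which read it came from."""
    columns = [Counter() for _ in range(length)]
    for start, seq in reads:
        for offset, base in enumerate(seq):
            columns[start + offset][base] += 1
    return "".join(col.most_common(1)[0][0] for col in columns)

consensus = per_position_consensus(reads, len(HAP_A))
print(consensus)
print("matches a real haplotype:", consensus in (HAP_A, HAP_B))
```

Because each column is decided independently, the consensus takes haplotype A's base at position 1 and haplotype B's base at position 4. A phasing-aware polisher would instead keep bases from the same haplotype together.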

Sequoia: an interactive visual analytics platform for interpretation and feature extraction from nanopore sequencing datasets

BMC Genomics ◽ 10.1186/s12864-021-07791-z ◽ 2021 ◽ Vol 22 (1)

Author(s): Ratanond Koonchanok ◽ Swapna Vidhur Daulatabad ◽ Quoseena Mir ◽ Khairi Reda ◽ Sarath Chandra Janga

Keyword(s): Single Molecule ◽ Visual Analytics ◽ Visual Analysis ◽ Direct Sequencing ◽ Visual Exploration ◽ Nanopore Sequencing ◽ Sequencing Data ◽ RNA Sequences ◽ Sequencing Technologies ◽ Signal Features

Abstract Background: Direct-sequencing technologies, such as Oxford Nanopore’s, are delivering long RNA reads with great efficacy and convenience. These technologies afford an ability to detect post-transcriptional modifications at a single-molecule resolution, promising new insights into the functional roles of RNA. However, realizing this potential requires new tools to analyze and explore this type of data. Results: Here, we present Sequoia, a visual analytics tool that allows users to interactively explore nanopore sequences. Sequoia combines a Python-based backend with a multi-view visualization interface, enabling users to import raw nanopore sequencing data in Fast5 format, cluster sequences based on electric-current similarities, and drill down into signals to identify properties of interest. We demonstrate the application of Sequoia by generating and analyzing ~500k reads from direct RNA sequencing data from the human HeLa cell line. We focus on comparing signal features from m6A and m5C RNA modifications as the first step towards building automated classifiers. We show how, through iterative visual exploration and tuning of dimensionality reduction parameters, we can separate modified RNA sequences from their unmodified counterparts. We also document new, qualitative signal signatures that distinguish these modifications from otherwise normal RNA bases, which we were able to discover from the visualization. Conclusions: Sequoia’s interactive features complement existing computational approaches in nanopore-based RNA workflows. The insights gleaned through visual analysis should help users in developing rationales, hypotheses, and insights into the dynamic nature of RNA. Sequoia is available at https://github.com/dnonatar/Sequoia.
