Nanopore base-calling from a perspective of instance segmentation

Mapping Intimacies ◽

10.1101/694919 ◽

2019 ◽

Author(s):

Yao-zhong Zhang ◽

Arda Akdemir ◽

Georg Tremmel ◽

Seiya Imoto ◽

Satoru Miyano ◽

...

Keyword(s):

Supplementary Information ◽

Sequencing Technology ◽

One Dimensional ◽

Sequential Dependencies ◽

Base Calling ◽

Third Generation Sequencing ◽

Pass Through ◽

Segmentation Task ◽

Generation Sequencing ◽

Instance Segmentation

Abstract. Background: Nanopore sequencing is a rapidly developing third-generation sequencing technology that can generate long nucleotide reads of molecules in real time on a portable device. Genotypes are determined by detecting changes in the ionic current signal as a DNA/RNA fragment passes through a nanopore. Currently, nanopore base-calling has a higher error rate than short-read base-calling. By utilizing deep neural networks, state-of-the-art nanopore base-callers achieve base-calling accuracy in the range of 85% to 95%. Results: In this work, we propose a novel base-calling approach from the perspective of instance segmentation. Unlike previous sequence labeling approaches, we formulate the base-calling problem as a multi-label segmentation task. We also propose a refined U-net model, which we call UR-net, that can model sequential dependencies for a one-dimensional segmentation task. The experimental results show that the proposed base-caller, URnano, achieves competitive results compared to the recently proposed CTC-featured base-caller Chiron, using the same amount of training and test data for in-domain evaluation. Our results show that formulating the base-calling problem as a one-dimensional segmentation task is a promising approach. Availability: The source code and data are available at https://github.com/yaozhong/[email protected] Supplementary information: Supplementary data are available as an online attachment.
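To make the segmentation view of base-calling concrete, the following is a minimal sketch and not URnano's actual decoding step. It assumes a model has already assigned one base label to every signal sample; the function name `collapse_segments` and the `min_len` parameter are hypothetical. Each run of identical labels is treated as one segment and emitted as one base.

```python
from itertools import groupby


def collapse_segments(labels, min_len=1):
    """Turn per-sample labels into a base sequence, one base per run.

    labels  : iterable of per-sample base labels, e.g. "AAAACCC..."
    min_len : runs shorter than this are treated as noise and skipped
              (hypothetical parameter, for illustration only)
    """
    bases = []
    for base, run in groupby(labels):
        # groupby yields one group per run of consecutive equal labels
        if len(list(run)) >= min_len:
            bases.append(base)
    return "".join(bases)


# Toy per-sample labelling of a 16-sample signal window (made-up data).
per_sample = list("AAAACCCGGGGGTTTA")
print(collapse_segments(per_sample))             # prints ACGTA
print(collapse_segments(per_sample, min_len=2))  # prints ACGT
```

A plain run-collapse like this cannot separate repeated bases (homopolymers), because two adjacent segments with the same label merge into one run. Any real segmentation-based base-caller needs some way to mark segment boundaries; this sketch does not attempt that.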


Apollo: a sequencing-technology-independent, scalable and accurate assembly polishing algorithm

Bioinformatics ◽

10.1093/bioinformatics/btaa179 ◽

2020 ◽

Vol 36 (12) ◽

pp. 3669-3679 ◽

Cited By ~ 3

Author(s):

Can Firtina ◽

Jeremie S Kim ◽

Mohammed Alser ◽

Damla Senol Cali ◽

A Ercument Cicek ◽

...

Keyword(s):

Genome Analysis ◽

Supplementary Information ◽

Third Generation ◽

Sequencing Technology ◽

Base Pairs ◽

Sequencing Technologies ◽

Third Generation Sequencing ◽

Long Reads ◽

Generation Sequencing ◽

Large Genomes

Abstract. Motivation: Third-generation sequencing technologies can sequence long reads that contain as many as 2 million base pairs. These long reads are used to construct an assembly (i.e. the subject’s genome), which is further used in downstream genome analysis. Unfortunately, third-generation sequencing technologies have high sequencing error rates and a large proportion of base pairs in these long reads is incorrectly identified. These errors propagate to the assembly and affect the accuracy of genome analysis. Assembly polishing algorithms minimize such error propagation by polishing or fixing errors in the assembly by using information from alignments between reads and the assembly (i.e. read-to-assembly alignment information). However, current assembly polishing algorithms can only polish an assembly using reads from either a certain sequencing technology or a small assembly. Such technology-dependency and assembly-size dependency require researchers to (i) run multiple polishing algorithms and (ii) use small chunks of a large genome to use all available readsets and polish large genomes, respectively. Results: We introduce Apollo, a universal assembly polishing algorithm that scales well to polish an assembly of any size (i.e. both large and small genomes) using reads from all sequencing technologies (i.e. second- and third-generation). Our goal is to provide a single algorithm that uses read sets from all available sequencing technologies to improve the accuracy of assembly polishing and that can polish large genomes. Apollo (i) models an assembly as a profile hidden Markov model (pHMM), (ii) uses read-to-assembly alignment to train the pHMM with the Forward–Backward algorithm and (iii) decodes the trained model with the Viterbi algorithm to produce a polished assembly. Our experiments with real readsets demonstrate that Apollo is the only algorithm that (i) uses reads from any sequencing technology within a single run and (ii) scales well to polish large assemblies without splitting the assembly into multiple parts. Availability and implementation: Source code is available at https://github.com/CMU-SAFARI/Apollo. Supplementary information: Supplementary data are available at Bioinformatics online.
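The decoding step mentioned in the abstract (Viterbi over a trained hidden Markov model) can be illustrated with a small sketch. This is a generic Viterbi decoder on a toy two-state HMM, not Apollo's profile HMM or its implementation; the states, probabilities and the name `viterbi` are all assumptions made up for illustration.

```python
import math


def viterbi(obs, states, log_start, log_trans, log_emit):
    """Return the most probable hidden-state path for obs (log-space)."""
    # best[t][s]: best log-probability of any path that ends in s at time t
    best = [{s: log_start[s] + log_emit[s][obs[0]] for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on that best path
    for t in range(1, len(obs)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p] + log_trans[p][s]) for p in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score + log_emit[s][obs[t]]
            back[t][s] = prev
    # Trace back from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    return path[::-1]


# Toy model with made-up numbers: the hidden state is the "true" base,
# the observation is a noisy read of it.
states = ("A", "C")
log_start = {s: math.log(0.5) for s in states}
log_trans = {p: {s: math.log(0.9 if p == s else 0.1) for s in states}
             for p in states}
log_emit = {s: {o: math.log(0.8 if s == o else 0.2) for o in states}
            for s in states}

decoded = "".join(viterbi("AACAA", states, log_start, log_trans, log_emit))
print(decoded)  # prints AAAAA
```

In this toy model the isolated "C" in the observed read is decoded as an "A". Switching state twice costs more than explaining one observation as an emission error, which is the same intuition behind using an HMM to correct errors.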


Analysis of prospective microbiology research using third-generation sequencing technology

Biodiversity Science ◽

10.17520/biods.2018201 ◽

2019 ◽

Vol 27 (5) ◽

pp. 534-542

Author(s):

Xu Yakun ◽

Ma Yue ◽

Hu Xiaoxi ◽

Wang Jun ◽

...

Keyword(s):

Third Generation ◽

Sequencing Technology ◽

Generation Sequencing Technology ◽

Third Generation Sequencing ◽

Microbiology Research ◽

Generation Sequencing


Third generation sequencing: technology and its potential impact on evolutionary biodiversity research

Systematics and Biodiversity ◽

10.1080/14772000.2015.1099575 ◽

2015 ◽

Vol 14 (1) ◽

pp. 1-8 ◽

Cited By ~ 82

Author(s):

Christoph Bleidorn

Keyword(s):

Third Generation ◽

Sequencing Technology ◽

Biodiversity Research ◽

Generation Sequencing Technology ◽

Third Generation Sequencing ◽

Potential Impact ◽

Generation Sequencing


Trichoderma reesei Rad51 tolerates mismatches in hybrid meiosis with diverse genome sequences

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.2007192118 ◽

2021 ◽

Vol 118 (8) ◽

pp. e2007192118

Author(s):

Wan-Chen Li ◽

Chia-Yi Lee ◽

Wei-Hsuan Lan ◽

Tai-Ting Woo ◽

Hou-Cheng Liu ◽

...

Keyword(s):

Trichoderma Reesei ◽

Single Molecule ◽

Amino Acid Residues ◽

Sequencing Technology ◽

Structural Variations ◽

Genome Sequences ◽

Third Generation Sequencing ◽

L1 And L2 ◽

Generation Sequencing ◽

Better Than

Most eukaryotes possess two RecA-like recombinases (ubiquitous Rad51 and meiosis-specific Dmc1) to promote interhomolog recombination during meiosis. However, some eukaryotes have lost Dmc1. Given that mammalian and yeast Saccharomyces cerevisiae (Sc) Dmc1 have been shown to stabilize recombination intermediates containing mismatches better than Rad51, we used the Pezizomycotina filamentous fungus Trichoderma reesei to address if and how Rad51-only eukaryotes conduct interhomolog recombination in zygotes with high sequence heterogeneity. We applied multidisciplinary approaches (next- and third-generation sequencing technology, genetics, cytology, bioinformatics, biochemistry, and single-molecule biophysics) to show that T. reesei Rad51 (TrRad51) is indispensable for interhomolog recombination during meiosis and, like ScDmc1, TrRad51 possesses better mismatch tolerance than ScRad51 during homologous recombination. Our results also indicate that the ancestral TrRad51 evolved to acquire ScDmc1-like properties by creating multiple structural variations, including via amino acid residues in the L1 and L2 DNA-binding loops.


The study of transcriptomes of symbiotic tissue of pea using the third-generation sequencing technology Oxford Nanopore

Abstract book of the 2nd International Scientific Conference "Plants and Microbes: the Future of Biotechnology" PLAMIC2020 ◽

10.28983/plamic2020.093 ◽

2020 ◽

Author(s):

E. S. Gribchenko

Keyword(s):

Nanopore Sequencing ◽

Third Generation ◽

Nitrogen Fixing ◽

Sequencing Technology ◽

The Third ◽

Third Generation Sequencing ◽

Oxford Nanopore ◽

Mycorrhizal Roots ◽

Gene Isoforms ◽

Generation Sequencing

The transcriptome profiles of mycorrhizal roots and inoculated nitrogen-fixing nodules of pea cv. Frisson were investigated using Oxford Nanopore sequencing technology. A database of gene isoforms and their expression has been created.


BubbleGun: Enumerating Bubbles and Superbubbles in Genome Graphs

10.1101/2021.03.23.436631 ◽

2021 ◽

Author(s):

Fawaz Dabbaghie ◽

Jana Ebler ◽

Tobias Marschall

Keyword(s):

De Novo ◽

General Purpose ◽

Supplementary Information ◽

De Bruijn Graph ◽

De Bruijn Graphs ◽

Third Generation Sequencing ◽

Human Sample ◽

Fast Development ◽

De Bruijn ◽

Generation Sequencing

Abstract. Motivation: With the fast development of third-generation sequencing machines, de novo genome assembly is becoming routine even for larger genomes. Graph-based representations of genomes arise both as part of the assembly process and in the context of pangenomes representing a population. In both cases, polymorphic loci lead to bubble structures in such graphs. Detecting bubbles is hence an important task when working with genomic variants in the context of genome graphs. Results: Here, we present a fast general-purpose tool, called BubbleGun, for detecting bubbles and superbubbles in genome graphs. Furthermore, BubbleGun detects and outputs runs of linearly connected bubbles and superbubbles, which we call bubble chains. We showcase its utility on de Bruijn graphs and compare our results to vg’s snarl detection. We show that BubbleGun is considerably faster than vg, especially in bigger graphs, where it reports all bubbles in less than 30 minutes on a human sample de Bruijn graph of around 2 million nodes. Availability: BubbleGun is available and documented at https://github.com/fawaz-dabbaghieh/bubble_gun under the MIT license. Contact: [email protected] or [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.
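As a rough illustration of what a bubble in a genome graph looks like, the sketch below finds only the simplest case: a source node with exactly two branch nodes that each have a single predecessor and reconverge on the same sink. It is not BubbleGun's algorithm and does not handle superbubbles or bubble chains; the function name `find_simple_bubbles` and the edge-list input format are assumptions.

```python
from collections import defaultdict


def find_simple_bubbles(edges):
    """Find simple bubbles in a directed graph given as (u, v) edge pairs.

    Returns a list of (source, (branch_a, branch_b), sink) tuples.
    """
    succ = defaultdict(set)
    pred = defaultdict(set)
    for u, v in edges:
        succ[u].add(v)
        pred[v].add(u)

    bubbles = []
    for source in sorted(succ):  # sorted() copies the keys before we iterate
        branches = sorted(succ[source])
        if len(branches) != 2:
            continue
        a, b = branches
        # Each branch is entered only from the source and leaves to one
        # node, and both branches leave to the same node (the sink).
        if (len(pred[a]) == 1 and len(pred[b]) == 1
                and len(succ[a]) == 1 and succ[a] == succ[b]):
            (sink,) = succ[a]
            bubbles.append((source, (a, b), sink))
    return bubbles


# Toy graph: n1 -> n2 -> {n3, n4} -> n5 -> n6 (one bubble between n2 and n5).
toy_edges = [("n1", "n2"), ("n2", "n3"), ("n2", "n4"),
             ("n3", "n5"), ("n4", "n5"), ("n5", "n6")]
print(find_simple_bubbles(toy_edges))
# prints [('n2', ('n3', 'n4'), 'n5')]
```

In a genome graph, the two branches would typically correspond to two alleles at a polymorphic locus, with the source and sink being the shared flanking sequence.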


Microsatellite marker development in Spanish mackerel Scomberomorus commerson using third generation sequencing technology

Molecular Biology Reports ◽

10.1007/s11033-020-05975-6 ◽

2020 ◽

Vol 47 (12) ◽

pp. 10005-10014

Author(s):

Linu Joy ◽

Sunitha Paulose ◽

PR Divya ◽

Charan Ravi ◽

VS Basheer ◽

...

Keyword(s):

Microsatellite Marker ◽

Marker Development ◽

Third Generation ◽

Sequencing Technology ◽

Spanish Mackerel ◽

Generation Sequencing Technology ◽

Third Generation Sequencing ◽

Scomberomorus Commerson ◽

Generation Sequencing


Detecting Streptococcus suis by nanopore sequencing in endophthalmitis: A case report

European Journal of Inflammation ◽

10.1177/20587392211002657 ◽

2021 ◽

Vol 19 ◽

pp. 205873922110026

Author(s):

Yi-Yan Wang ◽

Qiong Huang ◽

Yang Cheng

Keyword(s):

Case Report ◽

Traditional Method ◽

Culture Method ◽

Streptococcus Suis ◽

Nanopore Sequencing ◽

Pathogenic Microorganism ◽

Sequencing Technology ◽

Third Generation Sequencing ◽

Infectious Endophthalmitis ◽

Generation Sequencing

Endophthalmitis is a rare and infectious disease caused by Streptococcus suis (S suis). Traditionally, S suis is detected by the pathogenic microorganism culture method, which has a low positivity rate and a high false-negative rate. Nanopore sequencing (NS), which is a third-generation sequencing technology, has several advantages over the traditional method; in particular, it is cost- and time-effective and has a high throughput. In this report, a case of infectious endophthalmitis caused by trauma is examined. The NS results suggest a mixed infection caused by S suis and Clostridium perfringens. This case report provides evidence that NS can quickly identify pathogens, which is of great significance for clinical diagnosis and treatment.


Recovery of human gut microbiota genomes with third-generation sequencing

Cell Death and Disease ◽

10.1038/s41419-021-03829-y ◽

2021 ◽

Vol 12 (6) ◽

Author(s):

Yanfei Li ◽

Yueling Jin ◽

Jianming Zhang ◽

Haoying Pan ◽

Lan Wu ◽

...

Keyword(s):

Gut Microbiota ◽

Third Generation ◽

Sequencing Technology ◽

Bacterial Genomes ◽

Human Gut ◽

Human Gut Microbiota ◽

Third Generation Sequencing ◽

Large Numbers ◽

Generation Sequencing

Abstract. Human gut microbiota modulates normal physiological functions, such as maintenance of barrier homeostasis and modulation of metabolism, as well as various chronic diseases including type 2 diabetes and gastrointestinal cancer. Despite decades of research, the composition of the gut microbiota remains poorly understood. Here, we established an effective extraction method to obtain high-quality gut microbiota genomes, and analyzed them with third-generation sequencing technology. We acquired a large quantity of data from each sample and assembled large numbers of reliable contigs. With this approach, we constructed tens of completed bacterial genomes, among which were several new bacterial species. We also identified a new conditional pathogen, Enterococcus tongjius, which is a member of Enterococci. This work provided a novel and reliable approach to recover gut microbiota genomes, facilitating the discovery of new bacterial species and furthering our understanding of the microbiome that underlies human health and diseases.


Recovery of Human Gut Microbiota Genomes Substantially With Third-generation Sequencing

10.21203/rs.3.rs-87441/v1 ◽

2020 ◽

Author(s):

Yanfei Li ◽

Yueling Jin ◽

Haoying Pan ◽

Jianming Zhang ◽

Lan Wu ◽

...

Keyword(s):

Gut Microbiota ◽

Genomic Dna ◽

Extraction Method ◽

Third Generation ◽

Sequencing Technology ◽

Human Gut ◽

Third Generation Sequencing ◽

Health And Disease ◽

Generation Sequencing

Abstract. Background: Human gut microbiota modulates normal physiological functions, such as the maintenance of barrier homeostasis and the modulation of metabolism, and various chronic diseases including type 2 diabetes and gastrointestinal cancer. Despite decades of research, the composition of the gut microbiota remains unexplored and unidentified. Results: Here we established an effective extraction method to obtain high-quality gut microbiota genomic DNA and analyzed the samples with third-generation sequencing technology. We acquired a large amount of data from each sample and assembled many reliable contigs. We identified not only a large number of unknown genes but also several new bacterial subspecies or species. Conclusions: This work provides a novel and reliable framework to substantially recover gut microbiota genomes, facilitating the understanding of the roles of the microbiome that underlie human health and disease.
