tailfindr: Alignment-free poly(A) length measurement for Oxford Nanopore RNA and DNA sequencing

Mapping Intimacies ◽

10.1101/588343 ◽

2019 ◽

Cited By ~ 1

Author(s):

Maximilian Krause ◽

Adnan M. Niazi ◽

Kornel Labun ◽

Yamila N. Torres Cleuren ◽

Florian S. Müller ◽

...

Keyword(s):

Dna Sequencing ◽

Nuclear Export ◽

Messenger Rna ◽

Tail Length ◽

R Package ◽

Full Length ◽

Sequencing Data ◽

Oxford Nanopore ◽

Wide Range ◽

Rna And Dna

Polyadenylation at the 3’-end is a major regulator of messenger RNA, and its length is known to affect nuclear export, stability and translation, among others. Only recently, strategies have emerged that allow for genome-wide poly(A) length assessment. These methods identify genes connected to poly(A) tail measurements indirectly by short-read alignment to genetic 3’-ends. Concurrently, Oxford Nanopore Technologies (ONT) established full-length isoform RNA sequencing containing the entire poly(A) tail. However, assessing poly(A) length through basecalling has so far not been possible due to the inability to resolve long homopolymeric stretches in ONT sequencing. Here we present tailfindr, an R package to estimate poly(A) tail length on ONT long-read sequencing data. tailfindr operates on unaligned, basecalled data. It measures poly(A) tail length from both native RNA and DNA sequencing, which makes poly(A) tail studies by full-length cDNA approaches possible for the first time. We assess tailfindr’s performance across different poly(A) lengths, demonstrating that tailfindr is a versatile tool providing poly(A) tail estimates across a wide range of sequencing conditions.

De novo identification and sequence assembly of high-copy tandem repeats in raw data Oxford Nanopore plant DNA sequencing data

Systems Biology and Bioinformatics (SBB-2020) : The Twelfth International Young Scientists School ◽

10.18699/sbb-2020-17 ◽

2020 ◽

Keyword(s):

Dna Sequencing ◽

Tandem Repeats ◽

De Novo ◽

Sequence Assembly ◽

Sequencing Data ◽

Raw Data ◽

Plant Dna ◽

Oxford Nanopore

Real-time monitoring and analysis of SARS-CoV-2 nanopore sequencing with minoTour.

10.1101/2021.09.13.459777 ◽

2021 ◽

Author(s):

Rory James Munro ◽

Nadine Holmes ◽

Christopher Moore ◽

Matthew Carlile ◽

Alex Payne ◽

...

Keyword(s):

Real Time ◽

Phylogenetic Trees ◽

Sequencing Data ◽

Time Analysis ◽

Real Time Analysis ◽

Oxford Nanopore ◽

Individual Snps ◽

Wide Range ◽

Time Required ◽

Viral Sequencing

Motivation: The ongoing SARS-CoV-2 pandemic has demonstrated the utility of real-time analysis of sequencing data, with a wide range of databases and resources for analysis now available. Here we show how the real-time nature of Oxford Nanopore Technologies sequencers can accelerate consensus generation, lineage and variant status assignment. We exploit the fact that multiplexed viral sequencing libraries quickly generate sufficient data for the majority of samples, with diminishing returns on remaining samples as the sequencing run progresses. We demonstrate methods to determine when a sequencing run has passed this point in order to reduce the time required and cost of sequencing. Results: We extended minoTour, our real-time analysis and monitoring platform for nanopore sequencers, to provide SARS-CoV-2 analysis using ARTIC network pipelines. We additionally developed an algorithm to predict which samples will achieve sufficient coverage, automatically running the ARTIC medaka informatics pipeline once specific coverage thresholds have been reached on these samples. After testing on run data, we find significant run time savings are possible, enabling flow cells to be used more efficiently and enabling higher throughput data analysis. The resultant consensus genomes are assigned both PANGO lineage and variant status as defined by Public Health England. Samples from within individual runs are used to generate phylogenetic trees incorporating optional background samples as well as summaries of individual SNPs. As minoTour uses ARTIC pipelines, new primer schemes and pathogens can be added to allow minoTour to aid in real-time analysis of pathogens in the future.

iTALK: an R Package to Characterize and Illustrate Intercellular Communication

10.1101/507871 ◽

2019 ◽

Cited By ~ 26

Author(s):

Yuanxin Wang ◽

Ruiping Wang ◽

Shaojun Zhang ◽

Shumei Song ◽

Changying Jiang ◽

...

Keyword(s):

Intercellular Communication ◽

Cell Communication ◽

R Package ◽

Therapy Resistance ◽

Computational Approach ◽

Sequencing Data ◽

Communication Signals ◽

Cellular Processes ◽

Wide Range ◽

Single Cell Rna Sequencing

Crosstalk between tumor cells and other cells within the tumor microenvironment (TME) plays a crucial role in tumor progression, metastases, and therapy resistance. We present iTALK, a computational approach to characterize and illustrate intercellular communication signals in the multicellular tumor ecosystem using single-cell RNA sequencing data. iTALK can in principle be used to dissect the complexity, diversity, and dynamics of cell-cell communication from a wide range of cellular processes.

Parallel and scalable workflow for the analysis of Oxford Nanopore direct RNA sequencing datasets

10.1101/818336 ◽

2019 ◽

Author(s):

Luca Cozzuto ◽

Huanle Liu ◽

Leszek P. Pryszcz ◽

Toni Hermoso Pulido ◽

Julia Ponomarenko ◽

...

Keyword(s):

Rna Sequencing ◽

Single Molecule ◽

Tail Length ◽

Rna Modification ◽

Sequencing Data ◽

Polya Tail ◽

Sequencing Platform ◽

Rna Molecules ◽

Oxford Nanopore ◽

Quality Filtering

The direct RNA sequencing platform offered by Oxford Nanopore Technologies allows for direct measurement of RNA molecules without the need for conversion to complementary DNA, fragmentation or amplification. As such, it is virtually capable of detecting any given RNA modification present in the molecule that is being sequenced, as well as providing polyA tail length estimations at the level of individual RNA molecules. Although this technology has been publicly available since 2017, the complexity of the raw Nanopore data, together with the lack of systematic and reproducible pipelines, has greatly hindered the access of this technology to the general user. Here we address this problem by providing a fully benchmarked workflow for the analysis of direct RNA sequencing reads, termed MasterOfPores. The pipeline converts raw current intensities into multiple types of processed data, providing metrics of the quality of the run, quality-filtering, base-calling and mapping. The output of the pipeline can in turn be used to compute per-gene counts, RNA modifications, and prediction of polyA tail length and RNA isoforms. The software is written using the NextFlow framework for parallelization and portability, and relies on Linux containers such as Docker and Singularity for achieving better reproducibility. The MasterOfPores workflow can be executed on any Unix-compatible OS on a computer, cluster or cloud without the need to install any additional software or dependencies, and is freely available on GitHub (https://github.com/biocorecrg/master_of_pores). This workflow will significantly simplify the analysis of nanopore direct RNA sequencing data by non-bioinformatics experts, thus boosting the understanding of the (epi)transcriptome with single molecule resolution.

NanoR: a user-friendly R package to analyze and compare nanopore sequencing data

10.1101/514232 ◽

2019 ◽

Author(s):

Davide Bolognini ◽

Niccolò Bartalucci ◽

Alessandra Mingrino ◽

Alessandro Maria Vannucchi ◽

Alberto Magi

Keyword(s):

Real Time ◽

Low Cost ◽

R Package ◽

Sequencing Data ◽

High Performing ◽

Dna And Rna ◽

Oxford Nanopore ◽

The One ◽

User Friendly ◽

Oxford Nanopore Technologies

MinION and GridION X5 from Oxford Nanopore Technologies are devices for real-time DNA and RNA sequencing. On the one hand, MinION is the only real-time, low cost and portable sequencing device and, thanks to its unique properties, is becoming more and more popular among biologists; on the other, GridION X5, mainly for its costs, is less widespread but highly suitable for researchers with large sequencing projects. Despite the fact that Oxford Nanopore Technologies’ devices have been increasingly used in the last few years, there is a lack of high-performing and user-friendly tools to handle the data output by both MinION and GridION X5 platforms. Here we present NanoR, a cross-platform R package designed to simplify and improve nanopore data visualization. Indeed, NanoR is built on few functions but overcomes the capabilities of existing tools to extract meaningful information from MinION sequencing data; in addition, as exclusive features, NanoR can deal with GridION X5 sequencing outputs and allows comparison of both MinION and GridION X5 sequencing data in one command. NanoR is released as a free package for R at https://github.com/davidebolo1993/NanoR.

DR2S: An Integrated Algorithm Providing Reference-Grade Haplotype Sequences from Heterozygous Samples

10.1101/2020.11.09.374140 ◽

2020 ◽

Author(s):

Steffen Klasberg ◽

Alexander H. Schmidt ◽

Vinzenz Lange ◽

Gerhard Schöfl

Keyword(s):

Allelic Variation ◽

R Package ◽

Full Length ◽

Reference Sequence ◽

Read Length ◽

Sequencing Data ◽

High Quality ◽

Reference Allele ◽

Sequencing Technologies ◽

Generation Sequencing

Background: High resolution HLA genotyping of donors and recipients is a crucially important prerequisite for haematopoietic stem-cell transplantation and relies heavily on the quality and completeness of immuno-genetic reference sequence databases of allelic variation. Results: Here, we report on DR2S, an R package that leverages the strengths of two sequencing technologies – the accuracy of next-generation sequencing with the read length of third-generation sequencing technologies like PacBio’s SMRT sequencing or ONT’s nanopore sequencing – to reconstruct fully-phased high-quality full-length haplotype sequences. Although optimised for HLA and KIR genes, DR2S is applicable to all loci with known reference sequences provided that full-length sequencing data is available for analysis. In addition, DR2S integrates supporting tools for easy visualisation and quality control of the reconstructed haplotype to ensure suitability for submission to public allele databases. Conclusions: DR2S is a largely automated workflow designed to create high-quality fully-phased reference allele sequences for highly polymorphic gene regions such as HLA or KIR. It has been used by biologists to successfully characterise and submit more than 500 HLA alleles and more than 500 KIR alleles to the IPD-IMGT/HLA and IPD-KIR databases.

Telomere-to-telomere genome assembly of asparaginase-producing Trichoderma simmonsii

BMC Genomics ◽

10.1186/s12864-021-08162-4 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Dawoon Chung ◽

Yong Min Kwon ◽

Youngik Yang

Keyword(s):

Rna Sequencing ◽

Draft Genome ◽

Sequencing Analysis ◽

Sequencing Data ◽

Terrestrial Plants ◽

Protein Coding ◽

Oxford Nanopore ◽

Wide Range ◽

The Family ◽

Encoding Genes

Background: Trichoderma is a genus of fungi in the family Hypocreaceae and includes species known to produce enzymes with commercial use. They are largely found in soil and terrestrial plants. Recently, Trichoderma simmonsii isolated from decaying bark and decorticated wood was newly identified in the Harzianum clade of Trichoderma. Due to a wide range of applications in agriculture and other industries, genomes of at least 12 Trichoderma spp. have been studied. Moreover, antifungal and enzymatic activities have been extensively characterized in Trichoderma spp. However, the genomic information and bioactivities of T. simmonsii from a particular marine-derived isolate remain largely unknown. While we screened for asparaginase-producing fungi, we observed that T. simmonsii GH-Sj1 strain isolated from edible kelp produced asparaginase. In this study, we report a draft genome of T. simmonsii GH-Sj1 using Illumina and Oxford Nanopore technologies. Furthermore, to facilitate biotechnological applications of this species, RNA-sequencing was performed to elucidate the transcriptional profile of T. simmonsii GH-Sj1 in response to asparaginase-rich conditions. Results: We generated ~14 Gb of sequencing data assembled in a ~40 Mb genome. The T. simmonsii GH-Sj1 genome consisted of seven telomere-to-telomere scaffolds with no sequencing gaps, where the N50 length was 6.4 Mb. The total number of protein-coding genes was 13,120, constituting ~99% of the genome. The genome harbored 176 tRNAs, which encode a full set of 20 amino acids. In addition, it had an rRNA repeat region consisting of seven repeats of the 18S-ITS1–5.8S-ITS2–26S cluster. The T. simmonsii genome also harbored 7 putative asparaginase-encoding genes with potential medical applications. Using RNA-sequencing analysis, we found that 3 genes among the 7 putative genes were significantly upregulated under asparaginase-rich conditions. Conclusions: The genome and transcriptome of T. simmonsii GH-Sj1 established in the current work represent valuable resources for future comparative studies on fungal genomes and asparaginase production.

clonealign: statistical integration of independent single-cell RNA and DNA sequencing data from human cancers

Genome Biology ◽

10.1186/s13059-019-1645-z ◽

2019 ◽

Vol 20 (1) ◽

Cited By ~ 22

Author(s):

Kieran R. Campbell ◽

Adi Steif ◽

Emma Laks ◽

Hans Zahn ◽

Daniel Lai ◽

...

Keyword(s):

Dna Sequencing ◽

Single Cell ◽

Sequencing Data ◽

Human Cancers ◽

Statistical Integration ◽

Rna And Dna

tailfindr: alignment-free poly(A) length measurement for Oxford Nanopore RNA and DNA sequencing

RNA ◽

10.1261/rna.071332.119 ◽

2019 ◽

Vol 25 (10) ◽

pp. 1229-1241 ◽

Cited By ~ 11

Author(s):

Maximilian Krause ◽

Adnan M. Niazi ◽

Kornel Labun ◽

Yamila N. Torres Cleuren ◽

Florian S. Müller ◽

...

Keyword(s):

Dna Sequencing ◽

Length Measurement ◽

Alignment Free ◽

Oxford Nanopore ◽

Rna And Dna

Processing human frontal cortex brain tissue for population-scale Oxford Nanopore long-read DNA sequencing SOP v1

10.17504/protocols.io.b2ucqesw ◽

2021 ◽

Author(s):

Kimberley J J Billingsley ◽

Ramita Dewan ◽

Laksh Malik ◽

Pilar Alvarez Jerez ◽

Stith Kiley ◽

...

Keyword(s):

Alzheimer's Disease ◽

Dna Sequencing ◽

Frontal Cortex ◽

Brain Tissue ◽

Sequencing Data ◽

Oxford Nanopore ◽

Human Frontal Cortex ◽

Long Read ◽

Population Scale

At the NIH's Center for Alzheimer's and Related Dementias (CARD; https://card.nih.gov/research-programs/long-read-sequencing) we will generate long-read sequencing data from roughly 4000 patients with Alzheimer's disease, frontotemporal dementia, Lewy body dementia, and healthy subjects. With this research, we will build a public resource consisting of long-read genome sequencing data from a large number of people with confirmed Alzheimer's disease and related dementias and healthy individuals. To generate this large-scale nanopore sequencing data we have developed a protocol for processing and long-read sequencing human frontal cortex brain tissue, targeting an N50 of ~30 kb and ~30X coverage. †Correspondence to: Kimberley Billingsley and Cornelis Blauwendraat. Acknowledgements: We would like to thank the Nanopore team (Androo Markham & Hannah Lucio), Circulomics Inc team (Jeffrey Burke, Michelle Kim, Duncan Kilburn & Kelvin Liu) and the whole CARD long-read team listed below: UCSC: Benedict Paten, Mikhail Kolmogorov, Miten Jain, Kishwar Shafin, Trevor Pesout; NHGRI: Adam Phillippy, Arang Rhie; Baylor: Fritz Sedlazeck; JHU: Winston Timp; NINDS: Sonja Scholz; NIA: Cornelis Blauwendraat, Kimberley Billingsley, Frank Grenn, Pilar Alvarez Jerez, Bryan Traynor, Shannon Ballard, Caroline Pantazis; CZI: Paolo Carnevali.
