SquiggleKit: A toolkit for manipulating nanopore signal data

10.1101/549741 ◽

2019 ◽

Cited By ~ 1

Author(s):

James M. Ferguson ◽

Martin A. Smith

Keyword(s):

Signal Processing ◽

Nucleotide Sequence ◽

Signal Analysis ◽

Data Extraction ◽

Nanopore Sequencing ◽

Sequence Motifs ◽

Sequencing Data ◽

Signal Space ◽

Memory Footprint ◽

Cross Platform

Summary: The management of raw nanopore sequencing data poses a challenge that must be overcome to accelerate the development of new bioinformatics algorithms predicated on signal analysis. SquiggleKit is a toolkit for manipulating and interrogating nanopore data that simplifies file handling, data extraction, visualisation, and signal processing. Its modular tools can be used to reduce file numbers and memory footprint, identify poly-A tails, target barcodes and adapters, and find nucleotide sequence motifs in raw nanopore signal, amongst other applications. SquiggleKit serves as a bioinformatics portal into signal space, for novice and experienced users alike. It is comprehensively documented, simple to use, cross-platform compatible and freely available from https://github.com/Psy-Fer/SquiggleKit.
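
The idea of finding a sequence motif directly in raw signal can be illustrated with a small sketch. This is not SquiggleKit's implementation: the function names `zscore` and `find_template`, the toy current values, and the choice of a sliding-window mean squared distance on z-scored data are all assumptions made for illustration. The sketch slides a template of expected current levels along a signal and reports the window that resembles it most closely.

```python
from statistics import mean, pstdev


def zscore(values):
    """Scale values to zero mean and unit variance (a flat input maps to zeros)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def find_template(signal, template):
    """Return (offset, distance) of the signal window most similar to the template.

    Both the template and each window are z-scored first, so a window that is a
    scaled and shifted copy of the template gives a distance close to zero.
    """
    if len(template) == 0 or len(template) > len(signal):
        raise ValueError("template must be non-empty and no longer than signal")
    t = zscore(template)
    best_offset, best_dist = -1, float("inf")
    for start in range(len(signal) - len(template) + 1):
        w = zscore(signal[start:start + len(template)])
        # Mean squared difference between normalised window and template.
        dist = sum((a - b) ** 2 for a, b in zip(w, t)) / len(t)
        if dist < best_dist:
            best_offset, best_dist = start, dist
    return best_offset, best_dist


# Hypothetical toy data: the template is embedded at index 4 with a different
# gain and offset, surrounded by near-flat background signal.
template = [80.0, 95.0, 70.0, 110.0, 85.0]
signal = [100.0, 101.0, 99.0, 100.0] + [v * 1.2 + 5 for v in template] + [100.0, 102.0, 98.0]
offset, dist = find_template(signal, template)
```

Real nanopore signal also varies in dwell time per base, so a fixed-length window like this one would usually be replaced by an elastic measure such as dynamic time warping.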

SquiggleKit: a toolkit for manipulating nanopore signal data

Bioinformatics ◽

10.1093/bioinformatics/btz586 ◽

2019 ◽

Cited By ~ 2

Author(s):

James M Ferguson ◽

Martin A Smith

Keyword(s):

Signal Processing ◽

Signal Analysis ◽

Data Extraction ◽

Supplementary Information ◽

Nanopore Sequencing ◽

Supplementary Data ◽

Sequencing Data ◽

Cross Platform ◽

The Creation

Abstract. Summary: The management of raw nanopore sequencing data poses a challenge that must be overcome to facilitate the creation of new bioinformatics algorithms predicated on signal analysis. SquiggleKit is a toolkit for manipulating and interrogating nanopore data that simplifies file handling, data extraction, visualization and signal processing. Availability and implementation: SquiggleKit is cross platform and freely available from GitHub at https://github.com/Psy-Fer/SquiggleKit. Detailed documentation can be found at https://psy-fer.github.io/SquiggleKitDocs/. All tools have been designed to operate in Python 2.7+, with minimal additional libraries. Supplementary information: Supplementary data are available at Bioinformatics online.

Real-time demultiplexing Nanopore barcoded sequencing data with npBarcode

10.1101/134155 ◽

2017 ◽

Cited By ~ 1

Author(s):

Son Hoang Nguyen ◽

Tania Duarte ◽

Lachlan J. M. Coin ◽

Minh Duc Cao

Keyword(s):

Real Time ◽

Nanopore Sequencing ◽

Sequencing Data ◽

Analysis Pipeline ◽

Bioinformatic Tools ◽

Oxford Nanopore ◽

Pooled Sequencing ◽

Cross Platform ◽

Real Time Applications ◽

Friendly Graphical User Interface

Abstract. Motivation: The recently introduced barcoding protocol for Oxford Nanopore sequencing has increased the versatility of the technology. Several bioinformatic tools have been developed to demultiplex the barcoded reads, but none of them supports streaming analysis. This limits the use of pooled sequencing in real-time applications, which is one of the main advantages of the technology. Results: We introduced npBarcode, an open source and cross platform tool for barcode demultiplexing in a streaming fashion. npBarcode can be seamlessly integrated into a streaming analysis pipeline. The tool also provides a friendly graphical user interface through npReader, allowing real-time visual monitoring of the sequencing progress of barcoded samples. We show that npBarcode achieves accuracy comparable to that of the alternatives. Availability: npBarcode is bundled in Japsa, a Java toolkit for genome analysis, and is freely available at https://github.com/hsnguyen/npBarcode.
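
As a rough illustration of what demultiplexing in a streaming fashion means, the sketch below assigns each read to a sample as soon as it arrives instead of waiting for the run to finish. It is not npBarcode's algorithm: the names `hamming` and `demultiplex_stream`, the barcode sequences, and the use of a Hamming distance on the read prefix are assumptions chosen to keep the example short.

```python
def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def demultiplex_stream(reads, barcodes, max_mismatches=1):
    """Yield (read_id, sample) for each read as soon as it arrives.

    `reads` is any iterable of (read_id, sequence) pairs, so it can be a
    generator fed by a running sequencer. A read is assigned to the sample
    whose barcode best matches the start of the read; reads with too many
    mismatches, or with a tie between samples, are yielded with sample None.
    """
    for read_id, seq in reads:
        best_sample, best_d, tie = None, max_mismatches + 1, False
        for sample, barcode in barcodes.items():
            if len(seq) < len(barcode):
                continue
            d = hamming(seq[:len(barcode)], barcode)
            if d < best_d:
                best_sample, best_d, tie = sample, d, False
            elif d == best_d and best_sample is not None:
                tie = True
        yield read_id, (None if tie else best_sample)


# Hypothetical barcodes and reads, for illustration only.
barcodes = {"sampleA": "ACGTAC", "sampleB": "TTGGCC"}
incoming = iter([
    ("r1", "ACGTACGGGG"),   # exact match to sampleA
    ("r2", "TTGGCAGGGG"),   # one mismatch to sampleB
    ("r3", "GGGGGGGGGG"),   # no barcode within the mismatch limit
])
assignments = list(demultiplex_stream(incoming, barcodes))
```

Because the function is a generator, downstream steps can consume assignments one at a time while sequencing is still in progress.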

An Error Correction Method of Nanopore Sequencing Data Using Deep Learning

2020 13th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI) ◽

10.1109/cisp-bmei51763.2020.9263622 ◽

2020 ◽

Author(s):

Luotong Wang ◽

Li Qu ◽

Longshu Yang ◽

Yiying Wang ◽

Huaiqiu Zhu

Keyword(s):

Deep Learning ◽

Error Correction ◽

Correction Method ◽

Nanopore Sequencing ◽

Sequencing Data ◽

Error Correction Method

Sequoia: an interactive visual analytics platform for interpretation and feature extraction from nanopore sequencing datasets

BMC Genomics ◽

10.1186/s12864-021-07791-z ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Ratanond Koonchanok ◽

Swapna Vidhur Daulatabad ◽

Quoseena Mir ◽

Khairi Reda ◽

Sarath Chandra Janga

Keyword(s):

Single Molecule ◽

Visual Analytics ◽

Visual Analysis ◽

Direct Sequencing ◽

Visual Exploration ◽

Nanopore Sequencing ◽

Sequencing Data ◽

Rna Sequences ◽

Sequencing Technologies ◽

Signal Features

Abstract. Background: Direct-sequencing technologies, such as Oxford Nanopore’s, are delivering long RNA reads with great efficacy and convenience. These technologies afford an ability to detect post-transcriptional modifications at single-molecule resolution, promising new insights into the functional roles of RNA. However, realizing this potential requires new tools to analyze and explore this type of data. Results: Here, we present Sequoia, a visual analytics tool that allows users to interactively explore nanopore sequences. Sequoia combines a Python-based backend with a multi-view visualization interface, enabling users to import raw nanopore sequencing data in Fast5 format, cluster sequences based on electric-current similarities, and drill down into signals to identify properties of interest. We demonstrate the application of Sequoia by generating and analyzing ~500k reads from direct RNA sequencing data of the human HeLa cell line. We focus on comparing signal features from m6A and m5C RNA modifications as the first step towards building automated classifiers. We show how, through iterative visual exploration and tuning of dimensionality reduction parameters, we can separate modified RNA sequences from their unmodified counterparts. We also document new, qualitative signal signatures that distinguish these modifications from otherwise normal RNA bases, which we were able to discover from the visualization. Conclusions: Sequoia’s interactive features complement existing computational approaches in nanopore-based RNA workflows. The insights gleaned through visual analysis should help users in developing rationales, hypotheses, and insights into the dynamic nature of RNA. Sequoia is available at https://github.com/dnonatar/Sequoia.

Complete Genome Sequences of Six Listeria monocytogenes Sequence Type 9 Isolates from Meat Processing Plants in Norway

Genome Announcements ◽

10.1128/genomea.00016-18 ◽

2018 ◽

Vol 6 (7) ◽

Cited By ~ 3

Author(s):

Annette Fagerlund ◽

Solveig Langsrud ◽

Birgitte Moen ◽

Even Heir ◽

Trond Møretrø

Keyword(s):

Listeria Monocytogenes ◽

Complete Genome ◽

Foodborne Pathogen ◽

Sequence Type ◽

Nanopore Sequencing ◽

Meat Processing ◽

Sequencing Data ◽

Genome Sequences ◽

Fatal Disease ◽

Processing Plants

ABSTRACT Listeria monocytogenes is a foodborne pathogen that causes the often-fatal disease listeriosis. We present here the complete genome sequences of six L. monocytogenes isolates of sequence type 9 (ST9) collected from two different meat processing facilities in Norway. The genomes were assembled using Illumina and Nanopore sequencing data.

Acoustic Partial Discharge signal analysis using digital signal processing techniques

2013 Annual IEEE India Conference (INDICON) ◽

10.1109/indcon.2013.6725936 ◽

2013 ◽

Cited By ~ 1

Author(s):

Ripunjoy Phukan ◽

Subrata Karmakar

Keyword(s):

Signal Processing ◽

Digital Signal Processing ◽

Signal Analysis ◽

Partial Discharge ◽

Digital Signal ◽

Processing Techniques ◽

Signal Processing Techniques

Rank-adaptive signal processing (RASP): a subspace approach to biological signal analysis. II. Applications

Conference Record of the Thirty-Fourth Asilomar Conference on Signals, Systems and Computers (Cat. No.00CH37154) ◽

10.1109/acssc.2000.911313 ◽

2002 ◽

Author(s):

R.J. Semmani ◽

B.F. Womack ◽

R.E. Barr

Keyword(s):

Signal Processing ◽

Signal Analysis ◽

Adaptive Signal Processing ◽

Biological Signal ◽

Adaptive Signal

High resolution copy number inference in cancer using short-molecule nanopore sequencing

10.1101/2020.12.28.424602 ◽

2020 ◽

Author(s):

Timour Baslan ◽

Sam Kovaka ◽

Fritz J. Sedlazeck ◽

Yanming Zhang ◽

Robert Wappel ◽

...

Keyword(s):

Copy Number ◽

Cost Effective ◽

Chromosome Analysis ◽

Ease Of Use ◽

Precision Oncology ◽

Nanopore Sequencing ◽

Dna Molecules ◽

Sequencing Data ◽

Short Read ◽

Short Read Sequencing

Abstract: Genome copy number is an important source of genetic variation in health and disease. In cancer, clinically actionable Copy Number Alterations (CNAs) can be inferred from short-read sequencing data, enabling genomics-based precision oncology. Emerging Nanopore sequencing technologies offer the potential for broader clinical utility, for example in smaller hospitals, due to lower instrument cost, higher portability, and ease of use. Nonetheless, Nanopore sequencing devices are limited in the number of retrievable sequencing reads/molecules compared to short-read sequencing platforms. This represents a challenge for applications that require high read counts, such as CNA inference. To address this limitation, we targeted the sequencing of short-length DNA molecules loaded at optimized concentration in an effort to increase sequence read/molecule yield from a single nanopore run. We show that sequencing short DNA molecules reproducibly returns high read counts and allows high-quality CNA inference. We demonstrate the clinical relevance of this approach by accurately inferring CNAs in acute myeloid leukemia samples. The data show that, compared to traditional approaches such as chromosome analysis/cytogenetics, short-molecule nanopore sequencing returns more sensitive, accurate copy number information in a cost-effective and expeditious manner, including for multiplex samples. Our results provide a framework for the sequencing of relatively short DNA molecules on nanopore devices, with applications in research and medicine that include, but are not limited to, CNAs.
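
Why CNA inference depends on read counts can be seen in a minimal sketch of count-based copy number estimation. The abstract does not describe the authors' pipeline, so everything here is an assumption for illustration: the function names `bin_counts` and `log2_ratios`, the fixed bin size, the median normalisation, and the pseudocount. Reads are counted in fixed-width genomic bins, and each bin is compared with the median bin on a log2 scale.

```python
from collections import Counter
from math import log2
from statistics import median


def bin_counts(positions, bin_size, n_bins):
    """Count read start positions falling in each fixed-width bin."""
    counts = Counter(pos // bin_size for pos in positions)
    return [counts.get(i, 0) for i in range(n_bins)]


def log2_ratios(counts, pseudocount=0.5):
    """Return log2(count / median count) per bin, with a small pseudocount.

    Values near 0 suggest the typical copy number, positive values a gain,
    and negative values a loss, under the assumption that most bins are
    copy-number neutral.
    """
    m = median(counts)
    if m <= 0:
        raise ValueError("median bin count must be positive")
    return [log2((c + pseudocount) / (m + pseudocount)) for c in counts]


# Hypothetical read positions: five 1 kb bins holding 10, 10, 20, 10 and 5 reads.
positions = []
for bin_index, n_reads in enumerate([10, 10, 20, 10, 5]):
    positions.extend(bin_index * 1000 + k for k in range(n_reads))

counts = bin_counts(positions, bin_size=1000, n_bins=5)
ratios = log2_ratios(counts)
```

With only a handful of reads per bin, sampling noise is of the same order as the gain or loss being measured, which is why higher read yield per run makes such estimates more reliable.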

Analysis of fragment ends in plasma DNA from patients with cancer

10.1101/2021.04.23.21255935 ◽

2021 ◽

Author(s):

Karan K. Budhraja ◽

Bradon R. McDonald ◽

Michelle D. Stephens ◽

Tania Contente-Cuomo ◽

Havell Markus ◽

...

Keyword(s):

Nucleotide Sequence ◽

Characteristic Curve ◽

Cost Effective ◽

Cancer Diagnostics ◽

Whole Genome Sequencing Data ◽

Sequencing Data ◽

Plasma Dna ◽

Patients With Cancer ◽

Fragmentation Patterns ◽

Tumor Dna

Abstract: Fragmentation patterns observed in plasma DNA reflect chromatin accessibility in contributing cells. Since DNA shed from cancer cells and blood cells may differ in fragmentation patterns, we investigated whether analysis of genomic positioning and nucleotide sequence at fragment ends can reveal the presence of tumor DNA in blood and aid cancer diagnostics. We analyzed whole genome sequencing data from >2700 plasma DNA samples including healthy individuals and patients with 11 different cancer types. We observed higher fractions of fragments with aberrantly positioned ends in patients with cancer, driven by the contribution of tumor DNA to plasma. Genome-wide analysis of fragment ends using machine learning showed an overall area under the receiver operating characteristic curve of 0.96 for detection of cancer. Our findings remained robust with as few as 1 million fragments analyzed per sample, suggesting that analysis of fragment ends can become a cost-effective and accessible approach for cancer detection and monitoring. One-sentence summary: Analyzing the positioning and nucleotide sequence at fragment ends in plasma DNA may enable cancer diagnostics.
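
The area under the receiver operating characteristic curve quoted above can be read as the probability that a randomly chosen cancer sample receives a higher score than a randomly chosen healthy sample. The sketch below computes it directly from that definition. The function name `roc_auc` and the scores are hypothetical and are not taken from the study.

```python
def roc_auc(positive_scores, negative_scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the positive
    sample scores higher, with ties counted as half a win.
    """
    if not positive_scores or not negative_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical classifier scores, for illustration only.
cancer_scores = [0.9, 0.8, 0.4]
healthy_scores = [0.7, 0.3, 0.2]
auc = roc_auc(cancer_scores, healthy_scores)  # 8 of 9 pairs are ranked correctly
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups.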

Identifying common and novel cell types in single-cell RNA-sequencing data using FR-Match

10.1101/2021.10.17.464718 ◽

2021 ◽

Author(s):

Yun Zhang ◽

Brian Aevermann ◽

Rohan Gala ◽

Richard H. Scheuermann

Keyword(s):

Single Cell ◽

Cell Types ◽

Sample Type ◽

Cell Type ◽

Sequencing Data ◽

Excellent Performance ◽

Single Cell Rna Sequencing ◽

Accurate Performance ◽

Cross Platform ◽

Tissue Region

Reference cell type atlases powered by single cell transcriptomic profiling technologies have become available to study cellular diversity at a granular level. We present FR-Match for matching query datasets to reference atlases with robust and accurate performance for identifying novel cell types and non-optimally clustered cell types in the query data. This approach shows excellent performance for cross-platform, cross-sample type, cross-tissue region, and cross-data modality cell type matching.
