NanoDJ: A Dockerized Jupyter Notebook for Interactive Oxford Nanopore MinION Sequence Manipulation and Genome Assembly

Mapping Intimacies ◽

10.1101/586842 ◽

2019 ◽

Author(s):

Héctor Rodríguez-Pérez ◽

Tamara Hernández-Beeftink ◽

José M. Lorenzo-Salazar ◽

José L. Roda-García ◽

Carlos J. Pérez-González ◽

...

Keyword(s):

Quality Control ◽

Sequence Analysis ◽

DNA Sequence ◽

DNA Sequences ◽

Genome Assembly ◽

Interactive Visualization ◽

Hybrid Assembly ◽

Genomic Technologies ◽

Oxford Nanopore ◽

Oxford Nanopore Technologies

Abstract Background The Oxford Nanopore Technologies (ONT) MinION portable sequencer makes it possible to use cutting-edge genomic technologies in the field and the academic classroom. Results We present NanoDJ, a Jupyter notebook integration of tools for simplified manipulation and assembly of DNA sequences produced by ONT devices. It integrates basecalling, read trimming and quality control, simulation and plotting routines with a variety of widely used aligners and assemblers, including procedures for hybrid assembly. Conclusions Through Jupyter-facilitated access to self-explanatory contents of applications and the interactive visualization of results, as well as its distribution as a Docker software container, NanoDJ aims to simplify ONT DNA sequence analysis and make it more reproducible. The NanoDJ package code, documentation and installation instructions are freely available at https://github.com/genomicsITER/NanoDJ.

Optimizing experimental design for genome sequencing and assembly with Oxford Nanopore Technologies

10.1101/2020.05.05.079327 ◽

2020 ◽

Author(s):

John M. Sutton ◽

Janna L. Fierst

Keyword(s):

Experimental Design ◽

Genome Sequencing ◽

DNA Sequences ◽

Genome Assembly ◽

De Novo ◽

Error Rates ◽

Oxford Nanopore ◽

Broad Array ◽

Sequencing Strategy ◽

Oxford Nanopore Technologies

Summary High-quality reference genome sequences are the core of modern genomics. Oxford Nanopore Technologies (ONT) produces inexpensive DNA sequences in excess of 100,000 nucleotides, but error rates remain >10% and assembling these sequences, particularly for eukaryotes, is a non-trivial problem. To date there has been no comprehensive attempt to generate experimental design for ONT genome sequencing and assembly. Here, we simulate ONT and Illumina DNA sequence reads for Escherichia coli, Caenorhabditis elegans, Arabidopsis thaliana, and Drosophila melanogaster. We quantify the influence of sequencing coverage, assembly software and experimental design on de novo genome assembly and error correction to predict the optimum sequencing strategy for these organisms. We show proof of concept using real ONT data generated for the nematode Caenorhabditis remanei. ONT sequencing is inexpensive and accessible, and our quantitative results will be helpful for a broad array of researchers seeking guidance for de novo genome assembly projects.

Chromosome-level genome assembly of the female western mosquitofish (Gambusia affinis)

GigaScience ◽

10.1093/gigascience/giaa092 ◽

2020 ◽

Vol 9 (8) ◽

Cited By ~ 1

Author(s):

Feng Shao ◽

Arne Ludwig ◽

Yang Mao ◽

Ni Liu ◽

Zuogang Peng

Keyword(s):

DNA Sequences ◽

Genome Assembly ◽

Gambusia Affinis ◽

Comparative Genomic ◽

Suitable Model ◽

High Quality ◽

Western Mosquitofish ◽

Poeciliid Fish ◽

Oxford Nanopore ◽

Chromosome Level

Abstract Background The western mosquitofish (Gambusia affinis) is a sexually dimorphic poeciliid fish known for its worldwide biological invasion and is therefore an important research model for studying invasion biology. This organism may also be used as a suitable model to explore sex chromosome evolution and reproductive development in terms of differentiation of ZW sex chromosomes, ovoviviparity, and specialization of reproductive organs. However, there is a lack of high-quality genomic data for the female G. affinis; hence, this study aimed to generate a chromosome-level genome assembly for it. Results The chromosome-level genome assembly was constructed using Oxford Nanopore sequencing, BioNano, and Hi-C technology. G. affinis genomic DNA sequences containing 217 contigs with an N50 length of 12.9 Mb and 125 scaffolds with an N50 length of 26.5 Mb were obtained by Oxford Nanopore and BioNano, respectively, and 113 of the scaffolds (90.4% of scaffolds containing 97.9% nucleotide bases) were assembled into 24 chromosomes (pseudo-chromosomes) by Hi-C. The Z and W chromosomes of G. affinis were identified by comparative genomic analysis of female and male G. affinis, and the mechanism of differentiation of the Z and W chromosomes was explored. Combined with transcriptome data from 6 tissues, a total of 23,997 protein-coding genes were predicted and 23,737 (98.9%) genes were functionally annotated. Conclusions The high-quality female G. affinis reference genome provides a valuable omics resource for future studies of comparative genomics and functional genomics to explore the evolution of Z and W chromosomes and the reproductive developmental biology of G. affinis.
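The abstract above reports contig and scaffold N50 lengths. As a reminder of what that statistic means, here is a minimal sketch using the common definition (the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length). The function name `n50` and the toy lengths are illustrative assumptions, not data or code from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Uses the common definition: sort lengths from longest to shortest and
    return the length at which the running total first reaches at least
    half of the summed length.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy assembly (arbitrary units), not the G. affinis data.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

A higher N50 for the same total length indicates that more of the assembly sits in long sequences, which is why the scaffold N50 in the abstract exceeds the contig N50.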

Chasing perfection: validation and polishing strategies for telomere-to-telomere genome assemblies

10.21203/rs.3.rs-712747/v1 ◽

2021 ◽

Author(s):

Arang Rhie ◽

Ann Mc Cartney ◽

Kishwar Shafin ◽

Michael Alonge ◽

Andrey Bzikadze ◽

...

Keyword(s):

Genome Assembly ◽

Tandem Repeats ◽

Hydatidiform Mole ◽

Segmental Duplications ◽

Sequencing Technologies ◽

Oxford Nanopore ◽

Human Genome Assembly ◽

Long Read ◽

Genome Assemblies ◽

Oxford Nanopore Technologies

Abstract Advances in long-read sequencing technologies and genome assembly methods have enabled the recent completion of the first Telomere-to-Telomere (T2T) human genome assembly, which resolves complex segmental duplications and large tandem repeats, including centromeric satellite arrays in a complete hydatidiform mole (CHM13). Though derived from highly accurate sequencing, evaluation revealed that the initial T2T draft assembly had evidence of small errors and structural misassemblies. To correct these errors, we designed a novel repeat-aware polishing strategy that made accurate assembly corrections in large repeats without overcorrection, ultimately fixing 51% of the existing errors and improving the assembly QV to 73.9. By comparing our results to standard automated polishing tools, we outline common polishing errors and offer practical suggestions for genome projects with limited resources. We also show how sequencing biases in both PacBio HiFi and Oxford Nanopore Technologies reads cause signature assembly errors that can be corrected with a diverse panel of sequencing technologies.
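To put the reported assembly QV in perspective, the sketch below converts between a quality value and a per-base error rate. It assumes the QV in the abstract is Phred-scaled (QV = -10 log10 of the error probability), which is the usual convention for consensus quality but is not spelled out in the abstract itself. The function names are illustrative.

```python
import math


def qv_to_error_rate(qv):
    """Convert a Phred-scaled quality value to a per-base error probability."""
    return 10 ** (-qv / 10)


def error_rate_to_qv(error_rate):
    """Convert a per-base error probability to a Phred-scaled quality value."""
    if not 0 < error_rate <= 1:
        raise ValueError("error_rate must be in the interval (0, 1]")
    return -10 * math.log10(error_rate)


# Assumption: the abstract's QV of 73.9 is Phred-scaled.
rate = qv_to_error_rate(73.9)
print(f"error rate per base: {rate:.2e}")
print(f"about one error per {1 / rate:,.0f} bases")
```

Under that assumption, each increase of 10 in QV corresponds to a tenfold reduction in the expected error rate.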

Estimation of Similarity between DNA Sequences and Its Graphical Representation

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.f9389.059120 ◽

2020 ◽

Vol 9 (1) ◽

pp. 43-51

Keyword(s):

Sequence Analysis ◽

Molecular Biology ◽

DNA Sequence ◽

DNA Sequences ◽

Graphical Representation ◽

Similarity Analysis ◽

Biological Sequence ◽

Field Of Study ◽

Biological Sequence Analysis ◽

Wide Range

Bioinformatics, now a well-known field of study, originated in the context of biological sequence analysis. Graphical representation has recently been adopted in research on DNA sequences. Research on biological sequences is based mainly on their function and structure. Bioinformatics has a wide range of applications, particularly in molecular biology, which focuses on the analysis of molecules such as DNA, RNA, and proteins. In this review, we mainly deal with similarity analysis between sequences and the graphical representation of DNA sequences.

Complete Circular Genome Sequences of Brachyspira hyodysenteriae Isolates of the Four Different Sequence Types Causing Swine Dysentery in Switzerland

Microbiology Resource Announcements ◽

10.1128/mra.00847-21 ◽

2021 ◽

Vol 10 (39) ◽

Author(s):

Ana B. García-Martín ◽

Sarah Schmitt ◽

Friederike Zeeh ◽

Vincent Perreten

Keyword(s):

High Throughput Sequencing ◽

De Novo ◽

Hybrid Assembly ◽

Swine Dysentery ◽

Content Type ◽

Brachyspira Hyodysenteriae ◽

Oxford Nanopore ◽

Sequencing Platforms ◽

Sequence Types ◽

Oxford Nanopore Technologies

The complete genomes of four Brachyspira hyodysenteriae isolates of the four different sequence types (STs) (ST6, ST66, ST196, and ST197) causing swine dysentery in Switzerland were generated by whole-genome sequencing and de novo hybrid assembly of reads obtained from second-generation (Illumina) and third-generation (Oxford Nanopore Technologies and Pacific Biosciences) high-throughput sequencing platforms.

Systematics of Scleranthus (Caryophyllaceae)

10.26686/wgtn.16958875.v1 ◽

2021 ◽

Author(s):

Robin David Smissen

Keyword(s):

Sequence Analysis ◽

DNA Sequence ◽

DNA Sequences ◽

Sequence Divergence ◽

Morphological Characters ◽

DNA Sequence Analysis ◽

ITS Sequences ◽

Data Sets ◽

Nuclear ITS ◽

Floral Characters

Scleranthus is a genus of about 12 species of herbaceous flowering plants or small shrubs with a disjunct Eurasian/Australasian distribution. Monophyly of the genus is supported by the close similarity of gynoecial development of all species and is consistent with nuclear ITS DNA sequence analysis. Traditionally the genus has been divided into two sections, section Scleranthus and section Mniarum. Section Mniarum is exclusively Australasian, while section Scleranthus has been circumscribed to contain either exclusively European species or a combination of European and Australasian species. Pollen and floral characters align the species into Australasian and Eurasian groups, which are also supported by nuclear ITS DNA sequence analysis. Section Scleranthus as more broadly defined (i.e., sensu West and Garnock-Jones, 1986) is therefore at least paraphyletic or at worst polyphyletic. Phylogenetic reconstructions based on morphological characters differ from those based on ITS sequences in supporting different relationships within the Australasian species of Scleranthus. Hybridisation and introgression within the genus are discussed and suggested as the cause of discordance between morphology-based and DNA sequence-based trees. Low sequence divergence among Scleranthus ITS sequences suggests that the European and Australasian clades within the genus diverged within the last 10 million years. Biogeographic implications of this dating and competing hypotheses explaining the disjunct North-South distribution of the genus are discussed. Nuclear ITS and chloroplast ndhF DNA sequences both suggest that Scleranthus belongs to a clade within the family Caryophyllaceae consisting of members of subfamilies Alsinoideae and Caryophylloideae.
Phylogenetic relationships between genera belonging to the three subfamilies of Caryophyllaceae (Alsinoideae, Caryophylloideae, and Paronychioideae) are addressed in this thesis through ndhF sequence analysis, which provides no support for the monophyly of traditionally recognised groups. Morphological character data sets are likely to always encompass multiple incongruent data partitions (sensu Bull et al. 1993). It may therefore be appropriate to combine data from DNA sequence and morphology for parsimony analysis even where the two are significantly incongruent.

Complete Closed Genome Sequence of the Inulin-Utilizing Lactiplantibacillus plantarum Strain Lp900, Obtained Using a Hybrid Nanopore and Illumina Assembly

Microbiology Resource Announcements ◽

10.1128/mra.00185-21 ◽

2021 ◽

Vol 10 (17) ◽

Author(s):

Jori Fuhren ◽

Reindert Nijland ◽

Michiel Wels ◽

Jos Boekhorst ◽

Michiel Kleerebezem

Keyword(s):

Lactic Acid ◽

Lactic Acid Bacteria ◽

DNA Sequence ◽

Genome Sequence ◽

De Novo ◽

Circular Chromosome ◽

Content Type ◽

Diverse Species ◽

Oxford Nanopore ◽

Oxford Nanopore Technologies

Lactiplantibacillus plantarum is a genetically and phenotypically diverse species of lactic acid bacteria. We announce the hybrid de novo assembly of Oxford Nanopore Technologies and Illumina DNA sequence reads, producing a closed circular chromosome of 3,206,992 bp and six plasmids of the inulin-utilizing L. plantarum strain Lp900.

LongQC: A Quality Control Tool for Third Generation Sequencing Long Read Data

G3 Genes|Genomes|Genetics ◽

10.1534/g3.119.400864 ◽

2020 ◽

Vol 10 (4) ◽

pp. 1193-1196

Author(s):

Yoshinori Fukasawa ◽

Luca Ermini ◽

Hai Wang ◽

Karen Carty ◽

Min-Sin Cheung

Keyword(s):

Quality Control ◽

Third Generation ◽

Third Generation Sequencing ◽

Oxford Nanopore ◽

Quality Control Tool ◽

Long Read ◽

Automated Quality Control ◽

Oxford Nanopore Technologies ◽

Generation Sequencing ◽

Control Tool

We propose LongQC as an easy and automated quality control tool for genomic datasets generated by third generation sequencing (TGS) technologies such as Oxford Nanopore Technologies (ONT) and SMRT sequencing from Pacific Biosciences (PacBio). Key statistics were optimized for long read data, and LongQC covers all major TGS platforms. LongQC processes and visualizes those statistics automatically and quickly.

On Rényi entropies of order statistics

International Journal of Biomathematics ◽

10.1142/s1793524515500801 ◽

2015 ◽

Vol 08 (06) ◽

pp. 1550080 ◽

Cited By ~ 2

Author(s):

Richa Thapliyal ◽

H. C. Taneja

Keyword(s):

Distribution Function ◽

Sequence Analysis ◽

Order Statistics ◽

DNA Sequence ◽

DNA Sequences ◽

Biological Systems ◽

DNA Sequence Analysis ◽

Entropy Measure ◽

Entropy Measures ◽

Cumulative Residual

In this paper we consider a generalized dynamic entropy measure and prove that this measure characterizes the distribution function uniquely. We also propose the cumulative residual Rényi entropy of order statistics and prove that it also determines the distribution function uniquely. Applications of entropy concepts to DNA sequence analysis, the ultimate support for the biological systems, have been widely explored by researchers. The entropy measures discussed here can be applied for analysis of ordered DNA sequences.
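For readers unfamiliar with Rényi entropy, the sketch below computes the standard discrete Rényi entropy of order alpha, H_alpha = log2(sum of p_i^alpha) / (1 - alpha), over the symbol frequencies of a DNA string. This is a simplified illustration only: it is not the dynamic or cumulative residual measure for order statistics that the paper proposes, and the function name and example sequences are assumptions made here.

```python
import math
from collections import Counter


def renyi_entropy(sequence, alpha):
    """Discrete Rényi entropy (in bits) of the symbol frequencies in a string.

    Assumption: this is the textbook discrete form, not the paper's
    cumulative residual measure. As alpha approaches 1 the value tends to
    the Shannon entropy, which is returned directly in that case.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    counts = Counter(sequence.upper())
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence must not be empty")
    probs = [count / total for count in counts.values()]
    if math.isclose(alpha, 1.0):
        return -sum(p * math.log2(p) for p in probs)
    return math.log2(sum(p ** alpha for p in probs)) / (1 - alpha)


# Hypothetical sequences: uniform base usage versus a skewed composition.
print(renyi_entropy("ACGTACGTACGT", alpha=2))
print(renyi_entropy("AAAAAAAAACGT", alpha=2))
```

With four equally frequent bases the entropy is 2 bits for every order alpha, and it drops as the base composition becomes more skewed.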

Long reads from Nanopore sequencing as a tool for animal microbiome studies

10.1101/2019.12.21.886028 ◽

2019 ◽

Author(s):

Beatriz Delgado ◽

Magdalena Serrano ◽

Carmen González ◽

Alex Bach ◽

Oscar González-Recio

Keyword(s):

Illumina Sequencing ◽

DNA Sequences ◽

Abundant Species ◽

Rumen Microbiota ◽

Initial Cost ◽

Long Reads ◽

Oxford Nanopore ◽

Eukaryotic Species ◽

Sequencing Technique ◽

Oxford Nanopore Technologies

Abstract In the era of bioinformatics and metagenomics, the study of the ruminal microbiome has gained considerable relevance in the field of animal breeding, since the composition of the rumen microbiota significantly impacts production and the environment. Illumina sequencing is considered the gold standard for the analysis of microbiomes, but it is limited to producing only short DNA sequences for analysis. As an alternative, Oxford Nanopore Technologies (ONT) has developed a new sequencing technique based on nanopores that can be carried out in the MinION, a portable device with a low initial cost with which long DNA reads can be obtained. The aim of this study was to compare the performance of both types of sequencing applied to samples of ruminal content using a similar pipeline. ONT sequencing provided similar results to Illumina sequencing, although it was able to classify a greater number of reads at the species level, possibly due to the increase in read size. The results also suggest that, due to the size of the reads, it would be possible to obtain the same amount of information in a smaller number of hours. However, detection of archaeal and eukaryotic species is still difficult to accomplish due to their low abundance in the rumen compared to bacteria, suggesting that different pipelines and strategies are needed to obtain a whole representation of the less abundant species in the rumen microbiota.
