NanoDJ: A Dockerized Jupyter Notebook for Interactive Oxford Nanopore MinION Sequence Manipulation and Genome Assembly

2019 ◽  
Author(s):  
Héctor Rodríguez-Pérez ◽  
Tamara Hernández-Beeftink ◽  
José M. Lorenzo-Salazar ◽  
José L. Roda-García ◽  
Carlos J. Pérez-González ◽  
...  

Abstract Background: The Oxford Nanopore Technologies (ONT) MinION portable sequencer makes it possible to use cutting-edge genomic technologies in the field and the academic classroom. Results: We present NanoDJ, a Jupyter notebook integration of tools for simplified manipulation and assembly of DNA sequences produced by ONT devices. It integrates basecalling, read trimming and quality control, simulation and plotting routines with a variety of widely used aligners and assemblers, including procedures for hybrid assembly. Conclusions: Through Jupyter-facilitated access to self-explanatory application contents, interactive visualization of results, and distribution as a Docker software container, NanoDJ aims to simplify ONT DNA sequence analysis and make it more reproducible. The NanoDJ package code, documentation and installation instructions are freely available at https://github.com/genomicsITER/NanoDJ.

2020 ◽  
Author(s):  
John M. Sutton ◽  
Janna L. Fierst

Summary: High-quality reference genome sequences are the core of modern genomics. Oxford Nanopore Technologies (ONT) produces inexpensive DNA sequences in excess of 100,000 nucleotides but error rates remain >10% and assembling these sequences, particularly for eukaryotes, is a non-trivial problem. To date there has been no comprehensive attempt to generate experimental design for ONT genome sequencing and assembly. Here, we simulate ONT and Illumina DNA sequence reads for Escherichia coli, Caenorhabditis elegans, Arabidopsis thaliana, and Drosophila melanogaster. We quantify the influence of sequencing coverage, assembly software and experimental design on de novo genome assembly and error correction to predict the optimum sequencing strategy for these organisms. We show proof of concept using real ONT data generated for the nematode Caenorhabditis remanei. ONT sequencing is inexpensive and accessible, and our quantitative results will be helpful for a broad array of researchers seeking guidance for de novo genome assembly projects.


GigaScience ◽  
2020 ◽  
Vol 9 (8) ◽  
Author(s):  
Feng Shao ◽  
Arne Ludwig ◽  
Yang Mao ◽  
Ni Liu ◽  
Zuogang Peng

Abstract Background: The western mosquitofish (Gambusia affinis) is a sexually dimorphic poeciliid fish known for its worldwide biological invasion and is therefore an important research model for studying invasion biology. This organism may also be used as a suitable model to explore sex chromosome evolution and reproductive development in terms of differentiation of ZW sex chromosomes, ovoviviparity, and specialization of reproductive organs. However, there is a lack of high-quality genomic data for the female G. affinis; hence, this study aimed to generate a chromosome-level genome assembly for it. Results: The chromosome-level genome assembly was constructed using Oxford Nanopore sequencing, BioNano, and Hi-C technologies. G. affinis genomic DNA sequences containing 217 contigs with an N50 length of 12.9 Mb and 125 scaffolds with an N50 length of 26.5 Mb were obtained by Oxford Nanopore and BioNano, respectively, and 113 of the scaffolds (90.4% of scaffolds, containing 97.9% of nucleotide bases) were assembled into 24 chromosomes (pseudo-chromosomes) by Hi-C. The Z and W chromosomes of G. affinis were identified by comparative genomic analysis of female and male G. affinis, and the mechanism of differentiation of the Z and W chromosomes was explored. Combined with transcriptome data from 6 tissues, a total of 23,997 protein-coding genes were predicted and 23,737 (98.9%) genes were functionally annotated. Conclusions: The high-quality female G. affinis reference genome provides a valuable omics resource for future studies of comparative genomics and functional genomics to explore the evolution of Z and W chromosomes and the reproductive developmental biology of G. affinis.
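
The N50 lengths quoted for contigs and scaffolds follow the usual assembly-statistics definition: the length L such that sequences of length L or longer together contain at least half of the assembled bases. A minimal Python sketch of that calculation is given here; the function name and the contig lengths are made up for illustration and are not taken from the G. affinis assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating bases until
    # at least half of the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical contig lengths in bp, for illustration only.
example_contigs = [100, 80, 60, 40, 20]
print(n50(example_contigs))
```

Because N50 depends only on the length distribution, it says nothing about correctness: a misjoined assembly can have a higher N50 than an accurate one.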


2021 ◽  
Author(s):  
Arang Rhie ◽  
Ann Mc Cartney ◽  
Kishwar Shafin ◽  
Michael Alonge ◽  
Andrey Bzikadze ◽  
...  

Abstract: Advances in long-read sequencing technologies and genome assembly methods have enabled the recent completion of the first Telomere-to-Telomere (T2T) human genome assembly, which resolves complex segmental duplications and large tandem repeats, including centromeric satellite arrays in a complete hydatidiform mole (CHM13). Though derived from highly accurate sequencing, evaluation revealed that the initial T2T draft assembly had evidence of small errors and structural misassemblies. To correct these errors, we designed a novel repeat-aware polishing strategy that made accurate assembly corrections in large repeats without overcorrection, ultimately fixing 51% of the existing errors and improving the assembly QV to 73.9. By comparing our results to standard automated polishing tools, we outline common polishing errors and offer practical suggestions for genome projects with limited resources. We also show how sequencing biases in both PacBio HiFi and Oxford Nanopore Technologies reads cause signature assembly errors that can be corrected with a diverse panel of sequencing technologies.
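
To put the reported QV in perspective, the sketch below converts between a quality value and a per-base error rate. It assumes the QV is Phred-scaled, QV = -10 * log10(error rate), which is the usual convention for assembly quality values but is not stated in the abstract itself; the function names are illustrative.

```python
import math


def qv_to_error_rate(qv):
    """Convert a Phred-scaled quality value to a per-base error rate."""
    return 10 ** (-qv / 10)


def error_rate_to_qv(error_rate):
    """Convert a per-base error rate (0 < rate <= 1) to a Phred-scaled QV."""
    if not 0 < error_rate <= 1:
        raise ValueError("error_rate must be in (0, 1]")
    return -10 * math.log10(error_rate)


# QV value taken from the abstract; the Phred interpretation is an assumption.
rate = qv_to_error_rate(73.9)
print(f"{rate:.2e} errors per base")
print(f"about one error every {1 / rate / 1e6:.1f} Mb")
```

Under that assumption, a QV of 73.9 corresponds to roughly one error per 25 million bases.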


Bioinformatics, now a well-known field of study, originated in the context of biological sequence analysis. Recently, graphical representation has been adopted in research on DNA sequences. Research on biological sequences focuses mainly on their function and structure. Bioinformatics finds a wide range of applications, particularly in molecular biology, which focuses on the analysis of molecules such as DNA, RNA, and proteins. In this review, we mainly deal with similarity analysis between sequences and the graphical representation of DNA sequences.


2021 ◽  
Vol 10 (39) ◽  
Author(s):  
Ana B. García-Martín ◽  
Sarah Schmitt ◽  
Friederike Zeeh ◽  
Vincent Perreten

The complete genomes of four Brachyspira hyodysenteriae isolates of the four different sequence types (STs) (ST6, ST66, ST196, and ST197) causing swine dysentery in Switzerland were generated by whole-genome sequencing and de novo hybrid assembly of reads obtained from second-generation (Illumina) and third-generation (Oxford Nanopore Technologies and Pacific Biosciences) high-throughput sequencing platforms.


2021 ◽  
Author(s):  
Robin David Smissen

Scleranthus is a genus of about 12 species of herbaceous flowering plants or small shrubs with a disjunct Eurasian/Australasian distribution. Monophyly of the genus is supported by the close similarity of gynoecial development of all species and is consistent with nuclear ITS DNA sequence analysis. Traditionally the genus had been divided into two sections, section Scleranthus and section Mniarum. Section Mniarum is exclusively Australasian while section Scleranthus has been circumscribed to contain exclusively European species or a combination of European and Australasian species. Pollen and floral characters align the species into Australasian and Eurasian groups also supported by nuclear ITS DNA sequence analysis. Section Scleranthus as more broadly defined (i.e., sensu West and Garnock-Jones, 1986) is therefore at least paraphyletic or at worst polyphyletic. Phylogenetic reconstructions based on morphological characters differ from those based on ITS sequences in supporting different relationships within the Australasian species of Scleranthus. Hybridisation and introgression within the genus are discussed and suggested as the cause of discordance between morphology and DNA sequence based trees. Low sequence divergence among Scleranthus ITS sequences suggests that the European and Australasian clades within the genus diverged within the last 10 million years. Biogeographic implications of this dating and competing hypotheses explaining the disjunct North-South distribution of the genus are discussed. Nuclear ITS and chloroplast ndhF DNA sequences both suggest that Scleranthus belongs to a clade within the family Caryophyllaceae consisting of members of subfamilies Alsinoideae and Caryophylloideae.
Phylogenetic relationships between genera belonging to the three subfamilies of Caryophyllaceae (Alsinoideae, Caryophylloideae, and Paronychioideae) are addressed in this thesis through ndhF sequence analysis, which provides no support for the monophyly of traditionally recognised groups. Morphological character data sets are likely to always encompass multiple incongruent data partitions (sensu Bull et al. 1993). It may therefore be appropriate to combine data from DNA sequence and morphology for parsimony analysis even where the two are significantly incongruent.


2021 ◽  
Vol 10 (17) ◽  
Author(s):  
Jori Fuhren ◽  
Reindert Nijland ◽  
Michiel Wels ◽  
Jos Boekhorst ◽  
Michiel Kleerebezem

Lactiplantibacillus plantarum is a genetically and phenotypically diverse species of lactic acid bacteria. We announce the hybrid de novo assembly of Oxford Nanopore Technologies and Illumina DNA sequence reads, producing a closed circular chromosome of 3,206,992 bp and six plasmids of the inulin-utilizing L. plantarum strain Lp900.


2020 ◽  
Vol 10 (4) ◽  
pp. 1193-1196
Author(s):  
Yoshinori Fukasawa ◽  
Luca Ermini ◽  
Hai Wang ◽  
Karen Carty ◽  
Min-Sin Cheung

We propose LongQC as an easy and automated quality control tool for genomic datasets generated by third generation sequencing (TGS) technologies such as Oxford Nanopore Technologies (ONT) and SMRT sequencing from Pacific Biosciences (PacBio). Key statistics were optimized for long read data, and LongQC covers all major TGS platforms. LongQC processes and visualizes those statistics automatically and quickly.


2015 ◽  
Vol 08 (06) ◽  
pp. 1550080 ◽  
Author(s):  
Richa Thapliyal ◽  
H. C. Taneja

In this paper we consider a generalized dynamic entropy measure and prove that this measure characterizes the distribution function uniquely. We also propose the cumulative residual Rényi entropy of order statistics and prove that it also determines the distribution function uniquely. Applications of entropy concepts to DNA sequence analysis, the ultimate support for the biological systems, have been widely explored by researchers. The entropy measures discussed here can be applied for analysis of ordered DNA sequences.
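
For orientation, the standard textbook forms of the quantities named in this abstract can be written as follows. This is a sketch under the assumption that the paper builds on the commonly used definitions; the paper's own generalized dynamic measure is not reproduced here and may differ. The symbols $f$, $\bar F$, and $\alpha$ are notation chosen for this sketch.

```latex
% Renyi entropy of order \alpha for a continuous random variable X with density f:
H_\alpha(X) = \frac{1}{1-\alpha} \log \int_0^\infty f^{\alpha}(x)\,dx,
\qquad \alpha > 0,\ \alpha \neq 1.

% Cumulative residual Renyi entropy: the density is replaced by the
% survival function \bar F(x) = P(X > x):
\gamma_\alpha(X) = \frac{1}{1-\alpha} \log \int_0^\infty \bar F^{\alpha}(x)\,dx,
\qquad \alpha > 0,\ \alpha \neq 1.
```

For order statistics, the same expression would be evaluated with the survival function of the $i$-th order statistic in place of $\bar F$. As a quick check of the second formula, an exponential variable with rate $\lambda$ has $\bar F(x) = e^{-\lambda x}$, so the integral equals $1/(\alpha\lambda)$ and $\gamma_\alpha(X) = -\log(\alpha\lambda)/(1-\alpha)$.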


2019 ◽  
Author(s):  
Beatriz Delgado ◽  
Magdalena Serrano ◽  
Carmen González ◽  
Alex Bach ◽  
Oscar González-Recio

Abstract: In the era of bioinformatics and metagenomics, the study of the ruminal microbiome has gained considerable relevance in the field of animal breeding, since the composition of the rumen microbiota significantly impacts production and the environment. Illumina sequencing is considered the gold standard for the analysis of microbiomes, but it is limited by obtaining only short DNA sequences to analyze. As an alternative, Oxford Nanopore Technologies (ONT) has developed a new sequencing technique based on nanopores that can be carried out in the MinION, a portable device with a low initial cost with which long DNA reads can be obtained. The aim of this study was to compare the performance of both types of sequencing applied to samples of ruminal content using a similar pipeline. The ONT sequencing provided similar results to the Illumina sequencing, although it was able to classify a greater number of reads at the species level, possibly due to the increase in the read size. The results also suggest that, due to the size of the reads, it would be possible to obtain the same amount of information in a smaller number of hours. However, detection of archaeal and eukaryotic species is still difficult to accomplish due to their low abundance in the rumen compared to bacteria, suggesting different pipelines and strategies are needed to obtain a whole representation of the less abundant species in the rumen microbiota.

