Draft genome assemblies using sequencing reads from Oxford Nanopore Technology and Illumina platforms for four species of North American killifish from the Fundulus genus

Mapping Intimacies ◽

10.1101/686246 ◽

2019 ◽

Author(s):

Lisa K. Johnson ◽

Ruta Sahasrabudhe ◽

Tony Gill ◽

Jennifer Roach ◽

Lutz Froenicke ◽

...

Keyword(s):

North American ◽

De Novo ◽

Draft Genome ◽

Sequencing Data ◽

Sequence Coverage ◽

Illumina Platform ◽

Oxford Nanopore ◽

Fundulus Olivaceus ◽

Genome Assemblies ◽

Oxford Nanopore Technologies

Abstract Draft de novo reference genome assemblies were obtained from four North American killifish species (Fundulus xenicus, Fundulus catenatus, Fundulus nottii, and Fundulus olivaceus) using sequence reads from Illumina and Oxford Nanopore Technologies’ PromethION platforms. For each species, the PromethION platform was used to generate 30-45x sequence coverage, and the Illumina platform was used to generate 50-160x sequence coverage. Contig N50 values ranged from 0.4 Mb to 2.7 Mb, and BUSCO scores were consistently above 90% complete using the Eukaryota database. Draft assemblies and raw sequencing data are available for public use. We encourage use and re-use of these data for assembly benchmarking and external analyses.


Draft genome assemblies using sequencing reads from Oxford Nanopore Technology and Illumina platforms for four species of North American Fundulus killifish

GigaScience ◽

10.1093/gigascience/giaa067 ◽

2020 ◽

Vol 9 (6) ◽

Cited By ~ 3

Author(s):

Lisa K Johnson ◽

Ruta Sahasrabudhe ◽

James Anthony Gill ◽

Jennifer L Roach ◽

Lutz Froenicke ◽

...

Keyword(s):

North American ◽

De Novo ◽

Draft Genome ◽

Whole Genome Sequencing Data ◽

Sequencing Data ◽

Sequence Coverage ◽

Short Read ◽

Oxford Nanopore ◽

Long Read ◽

Genome Assemblies

Abstract Background Whole-genome sequencing data from wild-caught individuals of closely related North American killifish species (Fundulus xenicus, Fundulus catenatus, Fundulus nottii, and Fundulus olivaceus) were obtained using long-read Oxford Nanopore Technology (ONT) PromethION and short-read Illumina platforms. Findings Draft de novo reference genome assemblies were generated using a combination of long and short sequencing reads. For each species, the PromethION platform was used to generate 30–45× sequence coverage, and the Illumina platform was used to generate 50–160× sequence coverage. Illumina-only assemblies were fragmented with high numbers of contigs, while ONT-only assemblies were error prone with low BUSCO scores. The highest N50 values, ranging from 0.4 to 2.7 Mb, were from assemblies generated using a combination of short- and long-read data. BUSCO scores were consistently >90% complete using the Eukaryota database. Conclusions High-quality genomes can be obtained from a combination of using short-read Illumina data to polish assemblies generated with long-read ONT data. Draft assemblies and raw sequencing data are available for public use. We encourage use and reuse of these data for assembly benchmarking and other analyses.
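The N50 values quoted in this abstract are a standard contiguity metric: the length of the contig at which the cumulative length, counted from the longest contig down, first reaches half of the total assembly length. The following is a minimal sketch of that definition, not code from the paper; the function name `n50` and the example contig lengths are hypothetical and chosen only for illustration.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in base pairs)."""
    if not lengths:
        raise ValueError("no contig lengths given")
    # Walk from the longest contig to the shortest.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that brings the running sum to at least
        # half of the total assembly length defines the N50.
        if running >= half_total:
            return length


# Hypothetical contig lengths, not data from any assembly listed here.
example_contigs = [100, 200, 300, 400, 500]
example_n50 = n50(example_contigs)
```

A larger N50 means that more of the assembly sits in long contigs, which is why it is reported alongside completeness measures such as BUSCO rather than on its own.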


De Novo Genome Assemblies for Three North American Bumble Bee Species: Bombus bifarius, Bombus vancouverensis, and Bombus vosnesenskii

G3 Genes|Genome|Genetics ◽

10.1534/g3.120.401437 ◽

2020 ◽

Vol 10 (8) ◽

pp. 2585-2592

Author(s):

Sam D. Heraghty ◽

John M. Sutton ◽

Meaghan L. Pimsler ◽

Janna L. Fierst ◽

James P. Strange ◽

...

Keyword(s):

Evolutionary Biology ◽

De Novo ◽

Bombus Terrestris ◽

Bumble Bee ◽

Hybrid Assembly ◽

Widespread Species ◽

Oxford Nanopore ◽

Genome Assemblies ◽

High Degree ◽

Oxford Nanopore Technologies

Bumble bees are ecologically and economically important insect pollinators. Three abundant and widespread species in western North America, Bombus bifarius, Bombus vancouverensis, and Bombus vosnesenskii, have been the focus of substantial research relating to diverse aspects of bumble bee ecology and evolutionary biology. We present de novo genome assemblies for each of the three species using hybrid assembly of Illumina and Oxford Nanopore Technologies sequences. All three assemblies are of high quality with large N50s (> 2.2 Mb), BUSCO scores indicating > 98% complete genes, and annotations producing 13,325 – 13,687 genes, comparing favorably with other bee genomes. Analysis of synteny against the most complete bumble bee genome, Bombus terrestris, reveals a high degree of collinearity. These genomes should provide a valuable resource for addressing questions relating to functional genomics and evolutionary biology in these species.


Systematic Comparison of the Performances of De Novo Genome Assemblers for Oxford Nanopore Technology Reads From Piroplasm

Frontiers in Cellular and Infection Microbiology ◽

10.3389/fcimb.2021.696669 ◽

2021 ◽

Vol 11 ◽

Author(s):

Jinming Wang ◽

Kai Chen ◽

Qiaoyun Ren ◽

Ying Zhang ◽

Junlong Liu ◽

...

Keyword(s):

De Novo ◽

Next Generation Sequencing Data ◽

Sequencing Data ◽

Coverage Depth ◽

Sequence Coverage ◽

Long Reads ◽

Oxford Nanopore ◽

Generation Sequencing ◽

Easy Operation

Background Emerging long-read sequencing technology has greatly changed the landscape of whole-genome sequencing, enabling scientists to contribute to decoding the genetic information of non-model species. The sequences generated by PacBio or Oxford Nanopore Technology (ONT) must be assembled de novo before further analyses. Several de novo genome assemblers have been developed to assemble long reads generated by ONT, but their performance has not been completely investigated, and genome assembly is still a challenging task. Methods and Results We systematically evaluated the performance of nine de novo assemblers for ONT on datasets of different coverage depths. Several metrics were measured to determine the performance of these tools, including N50 length, sequence coverage, runtime, ease of operation, genome accuracy, and genomic completeness at varying depths of coverage. Based on the results of our assessments, the performances of these tools are summarized as follows: 1) Coverage depth has a significant effect on genome quality; 2) The level of contiguity of the assembled genome varies dramatically among different de novo tools; 3) The correctness of an assembled genome is closely related to the completeness of the genome. More than 30× nanopore data can be assembled into a relatively complete genome, the quality of which is highly dependent on polishing with next-generation sequencing data. Conclusion Considering the results of our investigation, the advantages and disadvantages of each tool are summarized, and guidelines for selecting assembly tools are provided under specific conditions.
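The coverage depth referred to above (for example, "30×") is conventionally estimated as the total number of sequenced bases divided by the genome size. The following is a minimal sketch of that arithmetic, not the authors' pipeline; the function name `mean_coverage` and the example numbers are hypothetical.

```python
def mean_coverage(read_lengths, genome_size):
    """Estimate mean coverage depth as total sequenced bases / genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return sum(read_lengths) / genome_size


# Hypothetical example: three reads of 10,000 bases over a 1,000-base genome.
example_depth = mean_coverage([10_000, 10_000, 10_000], 1_000)
```

This is an average over the whole genome; actual per-position depth varies, so a dataset with a nominal 30× mean can still have regions covered far more thinly.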


Comparison of long read methods for sequencing and assembly of a plant genome

10.1101/2020.03.16.992933 ◽

2020 ◽

Cited By ~ 1

Author(s):

Valentine Murigneux ◽

Subash Kumar Rai ◽

Agnelo Furtado ◽

Timothy J.C. Bruxner ◽

Wei Tian ◽

...

Keyword(s):

De Novo ◽

Cost Effective ◽

Genome Project ◽

Plant Genome ◽

Sequencing Data ◽

Sequencing Technologies ◽

Oxford Nanopore ◽

Long Read ◽

The Cost ◽

Genome Assemblies

Abstract Sequencing technologies have advanced to the point where it is possible to generate high accuracy, haplotype resolved, chromosome scale assemblies. Several long read sequencing technologies are available on the market and a growing number of algorithms have been developed over the last years to assemble the reads generated by those technologies. When starting a new genome project, it is therefore challenging to select the most cost-effective sequencing technology as well as the most appropriate software for assembly and polishing. For this reason, it is important to benchmark different approaches applied to the same sample. Here, we report a comparison of three long read sequencing technologies applied to the de novo assembly of a plant genome, Macadamia jansenii. We have generated sequencing data using Pacific Biosciences (Sequel I), Oxford Nanopore Technologies (PromethION) and BGI (single-tube Long Fragment Read) technologies for the same sample. Several assemblers were benchmarked in the assembly of PacBio and Nanopore reads. Results obtained from combining long read technologies or short read and long read technologies are also presented. The assemblies were compared for contiguity, accuracy and completeness as well as sequencing costs and DNA material requirements. Overall, the three long read technologies produced highly contiguous and complete genome assemblies of Macadamia jansenii. At the time of sequencing, the cost associated with each method was significantly different but continuous improvements in technologies have resulted in greater accuracy, increased throughput and reduced costs. We propose updating this comparison regularly with reports on significant iterations of the sequencing technologies.


Diversity of Pectobacteriaceae Species in Potato Growing Regions in Northern Morocco

Microorganisms ◽

10.3390/microorganisms8060895 ◽

2020 ◽

Vol 8 (6) ◽

pp. 895 ◽

Cited By ~ 2

Author(s):

Saïd Oulghazi ◽

Mohieddine Moumni ◽

Slimane Khayi ◽

Kévin Robic ◽

Sohaib Sarfraz ◽

...

Keyword(s):

Species Diversity ◽

Complete Genome ◽

Draft Genome ◽

Epidemiological Studies ◽

Gene Marker ◽

Oxford Nanopore ◽

Causative Agents ◽

Genome Analyses ◽

Oxford Nanopore Technologies

Dickeya and Pectobacterium pathogens are causative agents of several diseases that affect many crops worldwide. This work investigated the species diversity of these pathogens in Morocco, where Dickeya pathogens have only been isolated from potato fields recently. To this end, samplings were conducted in three major potato growing areas over a three-year period (2015–2017). Pathogens were characterized by sequence determination of both the gapA gene marker and genomes using Illumina and Oxford Nanopore technologies. We isolated 119 pathogens belonging to P. versatile (19%), P. carotovorum (3%), P. polaris (5%), P. brasiliense (56%) and D. dianthicola (17%). Their taxonomic assignation was confirmed by draft genome analyses of 10 representative strains of the collected species. D. dianthicola were isolated from a unique area where a wide species diversity of pectinolytic pathogens was observed. In tuber rotting assays, D. dianthicola isolates were more aggressive than Pectobacterium isolates. The complete genome sequence of D. dianthicola LAR.16.03.LID was obtained and compared with other D. dianthicola genomes from public databases. Overall, this study highlighted the ecological context from which some Dickeya and Pectobacterium species emerged in Morocco, and reported the first complete genome of a D. dianthicola strain isolated in Morocco that will be suitable for further epidemiological studies.


De novo identification and sequence assembly of high-copy tandem repeats in raw data Oxford Nanopore plant DNA sequencing data

Systems Biology and Bioinformatics (SBB-2020) : The Twelfth International Young Scientists School ◽

10.18699/sbb-2020-17 ◽

2020 ◽

Keyword(s):

Dna Sequencing ◽

Tandem Repeats ◽

De Novo ◽

Sequence Assembly ◽

Sequencing Data ◽

Raw Data ◽

Plant Dna ◽

Oxford Nanopore


Chasing perfection: validation and polishing strategies for telomere-to-telomere genome assemblies

10.21203/rs.3.rs-712747/v1 ◽

2021 ◽

Author(s):

Arang Rhie ◽

Ann Mc Cartney ◽

Kishwar Shafin ◽

Michael Alonge ◽

Andrey Bzikadze ◽

...

Keyword(s):

Genome Assembly ◽

Tandem Repeats ◽

Hydatidiform Mole ◽

Segmental Duplications ◽

Sequencing Technologies ◽

Oxford Nanopore ◽

Human Genome Assembly ◽

Long Read ◽

Genome Assemblies ◽

Oxford Nanopore Technologies

Abstract Advances in long-read sequencing technologies and genome assembly methods have enabled the recent completion of the first Telomere-to-Telomere (T2T) human genome assembly, which resolves complex segmental duplications and large tandem repeats, including centromeric satellite arrays in a complete hydatidiform mole (CHM13). Though derived from highly accurate sequencing, evaluation revealed that the initial T2T draft assembly had evidence of small errors and structural misassemblies. To correct these errors, we designed a novel repeat-aware polishing strategy that made accurate assembly corrections in large repeats without overcorrection, ultimately fixing 51% of the existing errors and improving the assembly QV to 73.9. By comparing our results to standard automated polishing tools, we outline common polishing errors and offer practical suggestions for genome projects with limited resources. We also show how sequencing biases in both PacBio HiFi and Oxford Nanopore Technologies reads cause signature assembly errors that can be corrected with a diverse panel of sequencing technologies.


Detection of Clinically Relevant Molecular Alterations in Chronic Lymphocytic Leukemia (CLL) By Nanopore Sequencing

Blood ◽

10.1182/blood-2018-99-110948 ◽

2018 ◽

Vol 132 (Supplement 1) ◽

pp. 1847-1847 ◽

Cited By ~ 1

Author(s):

Adam Burns ◽

David Robert Bruce ◽

Pauline Robbe ◽

Adele Timbs ◽

Basile Stamatopoulos ◽

...

Keyword(s):

Error Correction ◽

Low Cost ◽

Nanopore Sequencing ◽

Sequencing Data ◽

Mutation Status ◽

Short Read ◽

Short Read Sequencing ◽

Oxford Nanopore ◽

Low Coverage ◽

Oxford Nanopore Technologies

Abstract Introduction Chronic Lymphocytic Leukaemia (CLL) is the most prevalent leukaemia in the Western world and characterised by clinical heterogeneity. IgHV mutation status, mutations in the TP53 gene and deletions of the p-arm of chromosome 17 are currently used to predict an individual patient's response to therapy and give an indication as to their long-term prognosis. Current clinical guidelines recommend screening patients prior to initial, and any subsequent, treatment. Routine clinical laboratory practices for CLL involve three separate assays, each of which is time-consuming and requires significant investment in equipment. Nanopore sequencing offers a rapid, low-cost alternative, generating a full prognostic dataset on a single platform. In addition, Nanopore sequencing also promises low failure rates on degraded material such as FFPE and excellent detection of structural variants due to the long read length of sequencing. Importantly, Nanopore technology does not require expensive equipment, is low-maintenance and ideal for patient-near testing, making it an attractive DNA sequencing device for low-to-middle-income countries. Methods Eleven untreated CLL samples were selected for the analysis, harbouring both mutated (n=5) and unmutated (n=6) IgHV genes, seven TP53 mutations (five missense, one stop gain and one frameshift) and two del(17p) events. Primers were designed to amplify all exons of TP53, along with the IgHV locus, and each primer included universal tails for individual sample barcoding. The resulting PCR amplicons were prepared for sequencing using a ligation sequencing kit (SQK-LSK108, Oxford Nanopore Technologies, Oxford, UK). All IgHV libraries were pooled and sequenced on one R9.4 flowcell, with the TP53 libraries pooled and sequenced on a second R9.4 flowcell. Whole genome libraries were prepared from 400ng genomic DNA for each sample using a rapid sequencing kit (SQK-RAD004, Oxford Nanopore Technologies, Oxford, UK), and each sample was sequenced on an individual flowcell on a MinION mk1b instrument (Oxford Nanopore Technologies, Oxford, UK). We developed a bespoke bioinformatics pipeline to detect copy-number changes, TP53 mutations and IgHV mutation status from the Nanopore sequencing data. Results were compared to short-read sequencing data obtained earlier by targeted deep sequencing (MiSeq, Illumina Inc, San Diego, CA, USA) and whole genome sequencing (HiSeq 2500, Illumina Inc, San Diego, CA, USA). Results Following basecalling and adaptor trimming, the raw data were submitted to the IMGT database. In the absence of error correction, it was possible to identify the correct VH family for each sample; however, the germline homology was not sufficient to differentiate between IgHVmut and IgHVunmut CLL cases. Following bioinformatic error correction and consensus building, the percentage of germline homology was the same as that obtained from short-read sequencing, and nanopore sequencing also called the same productive rearrangements in all cases. A total of 77 TP53 variants were identified, including 68 in non-coding regions, and three synonymous SNVs. The remaining 6 were predicted to be functional variants (eight missense and two stop-gains) and had all been identified in earlier MiSeq targeted sequencing. However, the frameshift mutation was not called by the analysis pipeline, although it is present in the aligned reads. Using the low-coverage WGS data, we were able to identify del(17p) events, of 19Mb and 20Mb length, in both patients with high confidence. Conclusions Here we demonstrate that characterization of the IgHV locus in CLL cases is possible using the MinION platform, provided sufficient downstream analysis, including error correction, is applied. Furthermore, somatic SNVs in TP53 can be identified, although, as with second-generation sequencing, variant calling of small insertions and deletions is more problematic. Identification of del(17p) is possible from low-coverage WGS on the MinION and is inexpensive. Our data demonstrate that Nanopore sequencing can be a viable, patient-near, low-cost alternative to established screening methods, with the potential of diagnostic implementation in resource-poor regions of the world. Disclosures Schuh: Giles, Roche, Janssen, AbbVie: Honoraria.


Complete Circular Genome Sequences of Brachyspira hyodysenteriae Isolates of the Four Different Sequence Types Causing Swine Dysentery in Switzerland

Microbiology Resource Announcements ◽

10.1128/mra.00847-21 ◽

2021 ◽

Vol 10 (39) ◽

Author(s):

Ana B. García-Martín ◽

Sarah Schmitt ◽

Friederike Zeeh ◽

Vincent Perreten

Keyword(s):

High Throughput Sequencing ◽

De Novo ◽

Hybrid Assembly ◽

Swine Dysentery ◽

Content Type ◽

Brachyspira Hyodysenteriae ◽

Oxford Nanopore ◽

Sequencing Platforms ◽

Sequence Types ◽

Oxford Nanopore Technologies

The complete genomes of four Brachyspira hyodysenteriae isolates of the four different sequence types (STs) (ST6, ST66, ST196, and ST197) causing swine dysentery in Switzerland were generated by whole-genome sequencing and de novo hybrid assembly of reads obtained from second (Illumina) and third (Oxford Nanopore Technologies and Pacific Biosciences) high-throughput sequencing platforms.


NanoR: a user-friendly R package to analyze and compare nanopore sequencing data

10.1101/514232 ◽

2019 ◽

Author(s):

Davide Bolognini ◽

Niccolò Bartalucci ◽

Alessandra Mingrino ◽

Alessandro Maria Vannucchi ◽

Alberto Magi

Keyword(s):

Real Time ◽

Low Cost ◽

R Package ◽

Sequencing Data ◽

High Performing ◽

Dna And Rna ◽

Oxford Nanopore ◽

The One ◽

User Friendly ◽

Oxford Nanopore Technologies

Abstract MinION and GridION X5 from Oxford Nanopore Technologies are devices for real-time DNA and RNA sequencing. On the one hand, MinION is the only real-time, low-cost and portable sequencing device and, thanks to its unique properties, is becoming more and more popular among biologists; on the other, GridION X5, mainly because of its cost, is less widespread but highly suitable for researchers with large sequencing projects. Although Oxford Nanopore Technologies’ devices have been increasingly used in the last few years, there is a lack of high-performing and user-friendly tools to handle the data output by both MinION and GridION X5 platforms. Here we present NanoR, a cross-platform R package designed to simplify and improve nanopore data visualization. NanoR is built on a few functions but surpasses the capabilities of existing tools to extract meaningful information from MinION sequencing data; in addition, as exclusive features, NanoR can deal with GridION X5 sequencing outputs and allows comparison of both MinION and GridION X5 sequencing data in one command. NanoR is released as a free package for R at https://github.com/davidebolo1993/NanoR.
