Colonization and diversification of aquatic insects on three Macaronesian archipelagos using 59 nuclear loci derived from a draft genome

First draft genome of loach (Oreonectes shuilongensis; Cypriniformes: Nemacheilidae) provides insights into the evolution of cavefish

10.21203/rs.3.rs-192229/v1 ◽

2021 ◽

Author(s):

Zhijin Liu ◽

Xuekun Qian ◽

Ziming Wang ◽

Huamei Wen ◽

Ling Han ◽

...

Keyword(s):

Single Molecule ◽

Genome Assembly ◽

Eye Development ◽

Draft Genome ◽

Evolutionary Process ◽

Integrated Approach ◽

Sequencing Data ◽

Retina Development ◽

Draft Genome Assembly ◽

Surface Dwelling

Abstract Background: Loaches of the superfamily Cobitoidea (Cypriniformes, Nemacheilidae) are small, elongated, bottom-dwelling freshwater fishes with several barbels near the mouth. The genus Oreonectes, with 18 currently recognized species, contains representatives of all three key stages of the evolutionary process (a surface-dwelling lifestyle, facultative cave persistence, and permanent cave dwelling). Some Oreonectes species show typical cave-dwelling traits, such as partial or complete leucism and regression of the eyes, making them suitable study objects for microevolution. Genome information for Oreonectes species is therefore an indispensable resource for research into the evolution of cavefishes. Results: Here we assembled the genome sequence of O. shuilongensis, a surface-dwelling species, using an integrated approach that combined PacBio single-molecule real-time sequencing and Illumina X-ten paired-end sequencing. Based on a total of 50.9 Gb of sequencing data, our genome assembly from Canu and Pilon spans approximately 515.64 Mb (estimated coverage of 100×) and contains 803 contigs with a contig N50 of 5.58 Mb. A total of 25,247 protein-coding genes were predicted, of which 95.65% have been functionally annotated. We also performed genome re-sequencing of three additional cave-dwelling Oreonectes fishes. Twenty-nine pseudogenes annotated using DAVID showed significant enrichment for the GO terms “eye development” and “retina development in camera-type eye”. It is presumed that these pseudogenes might lead to eye degeneration in semi- or fully cave-dwelling Oreonectes species. Furthermore, Mc1r (melanocortin-1 receptor) is pseudogenized by a deletion in O. daqikongensis, likely blocking biosynthesis of melanin and leading to the albino phenotype. Conclusions: We here report the first draft genome assembly of Oreonectes fishes, which is also the first genome reference for Cobitoidea fishes. Pseudogenization of genes related to body color and eye development may be responsible for loss of pigmentation and vision deterioration in cave-dwelling species. This genome assembly will contribute to the study of the evolution and adaptation of fishes within Oreonectes and beyond (Cobitoidea).
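
As a reading aid for the assembly statistics quoted in this abstract, the following is a minimal Python sketch of how a contig N50 and a mean sequencing depth are conventionally computed. The function names and the toy contig lengths are hypothetical illustrations, not taken from the paper; only the 50.9 Gb and 515.64 Mb figures come from the abstract.

```python
def n50(lengths):
    """Return the N50: the length L such that contigs of length >= L
    together contain at least half of all assembled bases."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("no contig lengths given")


def mean_coverage(total_bases, assembly_size):
    """Mean depth = sequenced bases divided by assembly length."""
    return total_bases / assembly_size


# Hypothetical toy contig lengths (not from the paper): total 20, half 10.
toy_contigs = [2, 3, 4, 5, 6]
print(n50(toy_contigs))

# Figures from the abstract: 50.9 Gb of reads over a 515.64 Mb assembly.
print(round(mean_coverage(50.9e9, 515.64e6), 1))
```

With the abstract's figures the depth comes out just under 99×, which is consistent with the reported estimate of roughly 100×.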


Draft Genome Sequences of Ciprofloxacin-Resistant Salmonella enterica Strains with Multiple-Antibiotic Resistance, Isolated from Imported Foods

Genome Announcements ◽

10.1128/genomea.01222-17 ◽

2017 ◽

Vol 5 (45) ◽

Author(s):

Ashraf A. Khan ◽

Bijay K. Khajanchi ◽

Sana A. Khan ◽

Christopher A. Elkins ◽

Steven L. Foley

Keyword(s):

Antibiotic Resistance ◽

Salmonella Enterica ◽

Draft Genome ◽

Resistance Mechanisms ◽

Whole Genome Sequencing Data ◽

Sequencing Data ◽

Genome Sequences ◽

Content Type ◽

Imported Foods ◽

High Level

ABSTRACT We report here the draft genome sequences of 15 ciprofloxacin-resistant Salmonella enterica strains with resistance to multiple other antibiotics, including aminoglycosides, β-lactams, sulfonamides, tetracycline, and trimethoprim, isolated from different imported foods. Three strains (NCTR75, NCTR281, and NCTR350) showed a high level of ciprofloxacin resistance compared to that of the other isolates. The whole-genome sequencing data provide a better understanding of the antibiotic resistance mechanisms and virulence properties of these isolates.


A de novo assembly of the sweet cherry (Prunus avium cv. Tieton) genome using linked-read sequencing technology

PeerJ ◽

10.7717/peerj.9114 ◽

2020 ◽

Vol 8 ◽

pp. e9114 ◽

Cited By ~ 1

Author(s):

Jiawei Wang ◽

Weizhen Liu ◽

Dongzi Zhu ◽

Xiang Zhou ◽

Po Hong ◽

...

Keyword(s):

Sweet Cherry ◽

Prunus Avium ◽

Reference Genome ◽

De Novo ◽

Draft Genome ◽

Single Copy ◽

Sequencing Data ◽

Sequencing Technology ◽

High Quality ◽

Eukaryotic Genes

The sweet cherry (Prunus avium) is one of the most economically important fruit species in the world. However, there is a limited amount of genetic information available for this species, which hinders breeding efforts at a molecular level. We were able to describe a high-quality reference genome assembly and annotation of the diploid sweet cherry (2n = 2x = 16) cv. Tieton using linked-read sequencing technology. We generated over 750 million clean reads, representing 112.63 GB of raw sequencing data. The Supernova assembler produced a more highly-ordered and continuous genome sequence than the current P. avium draft genome, with a contig N50 of 63.65 KB and a scaffold N50 of 2.48 MB. The final scaffold assembly was 280.33 MB in length, representing 82.12% of the estimated Tieton genome. Eight chromosome-scale pseudomolecules were constructed, completing a 214 MB sequence of the final scaffold assembly. De novo, homology-based, and RNA-seq methods were used together to predict 30,975 protein-coding loci. 98.39% of core eukaryotic genes and 97.43% of single copy orthologues were identified in the embryo plant, indicating the completeness of the assembly. Linked-read sequencing technology was effective in constructing a high-quality reference genome of the sweet cherry, which will benefit the molecular breeding and cultivar identification in this species.


Phylogenomic approaches to detecting and characterizing introgression

Genetics ◽

10.1093/genetics/iyab173 ◽

2021 ◽

Author(s):

Mark S Hibbins ◽

Matthew W Hahn

Keyword(s):

Whole Genome Sequencing ◽

Genome Sequencing ◽

Future Development ◽

Rapid Growth ◽

Phylogenetic Networks ◽

Whole Genome Sequencing Data ◽

Whole Genome ◽

Sequencing Data ◽

Multispecies Coalescent ◽

Use Of Data

Abstract Phylogenomics has revealed the remarkable frequency with which introgression occurs across the tree of life. These discoveries have been enabled by the rapid growth of methods designed to detect and characterize introgression from whole-genome sequencing data. A large class of phylogenomic methods makes use of data across species to infer and characterize introgression based on expectations from the multispecies coalescent. These methods range from simple tests, such as the D-statistic, to model-based approaches for inferring phylogenetic networks. Here, we provide a detailed overview of the various signals that different modes of introgression are expected to leave in the genome, and how current methods are designed to detect them. We discuss the strengths and pitfalls of these approaches and identify areas for future development, highlighting the different signals of introgression, and the power of each method to detect them. We conclude with a discussion of current challenges in inferring introgression and how they could potentially be addressed.
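
For readers unfamiliar with the D-statistic mentioned in this abstract, the sketch below shows its usual form for a four-taxon tree (((P1, P2), P3), O): the normalized difference between the counts of ABBA and BABA site patterns. Under incomplete lineage sorting alone the two counts are expected to be equal, so D is expected to be near zero. The function name and the site counts are hypothetical, and a real analysis would also need a significance test (for example a block jackknife), which is not shown here.

```python
def d_statistic(n_abba, n_baba):
    """Return D = (ABBA - BABA) / (ABBA + BABA) from site-pattern counts."""
    total = n_abba + n_baba
    if total == 0:
        raise ValueError("no informative ABBA or BABA sites")
    return (n_abba - n_baba) / total


# Hypothetical counts: an excess of ABBA sites gives a positive D.
print(d_statistic(120, 80))
```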


Identifying contamination with advanced visualization and analysis practices: metagenomic approaches for eukaryotic genome assemblies

PeerJ ◽

10.7717/peerj.1839 ◽

2016 ◽

Vol 4 ◽

pp. e1839 ◽

Cited By ~ 57

Author(s):

Tom O. Delmont ◽

A. Murat Eren

Keyword(s):

High Throughput Sequencing ◽

Draft Genome ◽

Cost Effective ◽

Single Copy ◽

Eukaryotic Genome ◽

Sequencing Data ◽

Bacterial Genomes ◽

Long Read ◽

Domains Of Life ◽

Genome Assemblies

High-throughput sequencing provides a fast and cost-effective means to recover genomes of organisms from all domains of life. However, adequate curation of the assembly results against potential contamination of non-target organisms requires advanced bioinformatics approaches and practices. Here, we re-analyzed the sequencing data generated for the tardigrade Hypsibius dujardini, and created a holistic display of the eukaryotic genome assembly using DNA data originating from two groups and eleven sequencing libraries. By using bacterial single-copy genes, k-mer frequencies, and coverage values of scaffolds we could identify and characterize multiple near-complete bacterial genomes from the raw assembly, and curate a 182 Mbp draft genome for H. dujardini supported by RNA-Seq data. Our results indicate that most contaminant scaffolds were assembled from Moleculo long-read libraries, and most of these contaminants differed between library preparations. Our re-analysis shows that visualization and curation of eukaryotic genome assemblies can benefit from tools designed to address the needs of today’s microbiologists, who are constantly challenged by the difficulties associated with the identification of distinct microbial genomes in complex environmental metagenomes.
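
To make the "k-mer frequencies" signal in this abstract concrete, here is a minimal Python sketch of a k-mer frequency profile for a single scaffold. It is only an illustration of the general idea that scaffolds from different organisms tend to have different profiles. The function name, the default k = 4, and the example sequence are assumptions; the abstract does not state which k or which software the authors used.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(sequence, k=4):
    """Return relative frequencies of all A/C/G/T k-mers in a sequence.

    Windows containing other characters (such as N) are ignored.
    """
    seq = sequence.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[kmer] for kmer in kmers)
    if total == 0:
        return {kmer: 0.0 for kmer in kmers}
    return {kmer: counts[kmer] / total for kmer in kmers}


# Hypothetical toy scaffold, profiled with k = 2 for brevity.
profile = kmer_frequencies("ACGTACGT", k=2)
print(profile["AC"])
```

Profiles like this can be compared between scaffolds, alongside coverage values, to flag groups of scaffolds that may come from a non-target organism.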


Draft genome assemblies using sequencing reads from Oxford Nanopore Technology and Illumina platforms for four species of North American Fundulus killifish

GigaScience ◽

10.1093/gigascience/giaa067 ◽

2020 ◽

Vol 9 (6) ◽

Cited By ~ 3

Author(s):

Lisa K Johnson ◽

Ruta Sahasrabudhe ◽

James Anthony Gill ◽

Jennifer L Roach ◽

Lutz Froenicke ◽

...

Keyword(s):

North American ◽

De Novo ◽

Draft Genome ◽

Whole Genome Sequencing Data ◽

Sequencing Data ◽

Sequence Coverage ◽

Short Read ◽

Oxford Nanopore ◽

Long Read ◽

Genome Assemblies

Abstract Background: Whole-genome sequencing data from wild-caught individuals of closely related North American killifish species (Fundulus xenicus, Fundulus catenatus, Fundulus nottii, and Fundulus olivaceus) were obtained using long-read Oxford Nanopore Technology (ONT) PromethION and short-read Illumina platforms. Findings: Draft de novo reference genome assemblies were generated using a combination of long and short sequencing reads. For each species, the PromethION platform was used to generate 30–45× sequence coverage, and the Illumina platform was used to generate 50–160× sequence coverage. Illumina-only assemblies were fragmented with high numbers of contigs, while ONT-only assemblies were error prone with low BUSCO scores. The highest N50 values, ranging from 0.4 to 2.7 Mb, were from assemblies generated using a combination of short- and long-read data. BUSCO scores were consistently >90% complete using the Eukaryota database. Conclusions: High-quality genomes can be obtained by using short-read Illumina data to polish assemblies generated with long-read ONT data. Draft assemblies and raw sequencing data are available for public use. We encourage use and reuse of these data for assembly benchmarking and other analyses.


Identifying contamination with advanced visualization and analysis practices: metagenomic approaches for eukaryotic genome assemblies

10.7287/peerj.preprints.1695v1 ◽

2016 ◽

Cited By ~ 1

Author(s):

Tom O Delmont ◽

A. Murat Eren

Keyword(s):

High Throughput Sequencing ◽

Draft Genome ◽

Cost Effective ◽

Single Copy ◽

Eukaryotic Genome ◽

Metagenomic Data ◽

Sequencing Data ◽

Long Read ◽

Domains Of Life ◽

Genome Assemblies

High-throughput sequencing provides a fast and cost-effective means to recover genomes of organisms from all domains of life. However, adequate curation of the assembly results against potential contamination of non-target organisms requires advanced bioinformatics approaches and practices. Here, we re-analyzed the sequencing data generated for the tardigrade Hypsibius dujardini using approaches routinely employed by microbial ecologists who reconstruct bacterial and archaeal genomes from metagenomic data. We created a holistic display of the eukaryotic genome assembly using DNA data originating from two groups and eleven sequencing libraries. By using bacterial single-copy genes, k-mer frequencies, and coverage values of scaffolds we could identify and characterize multiple near-complete bacterial genomes, and curate a 182 Mbp draft genome for H. dujardini supported by RNA-Seq data. Our results indicate that most contaminant scaffolds were assembled from Moleculo long-read libraries, and most of these contaminants differed between library preparations. Our re-analysis shows that visualization and curation of eukaryotic genome assemblies can benefit from tools designed to address the needs of today’s microbiologists, who are constantly challenged by the difficulties associated with the identification of distinct microbial genomes in complex environmental metagenomes.


Analyzing and Characterizing the Chloroplast Genome of Salix wilsonii

BioMed Research International ◽

10.1155/2019/5190425 ◽

2019 ◽

Vol 2019 ◽

pp. 1-14 ◽

Cited By ~ 3

Author(s):

Yingnan Chen ◽

Nan Hu ◽

Huaitong Wu

Keyword(s):

Chloroplast Genome ◽

Tandem Repeats ◽

Single Copy ◽

Rrna Genes ◽

Whole Genome Sequencing Data ◽

Trna Genes ◽

Sequencing Data ◽

Protein Coding ◽

Organelle Genomes ◽

Additional Sequence

Salix wilsonii is an important ornamental willow tree widely distributed in China. In this study, an integrated circular chloroplast genome was reconstructed for S. wilsonii based on the chloroplast reads screened from the whole-genome sequencing data generated with the PacBio RSII platform. The obtained pseudomolecule was 155,750 bp long and had a typical quadripartite structure, comprising a large single copy region (LSC, 84,638 bp) and a small single copy region (SSC, 16,282 bp) separated by two inverted repeat regions (IR, 27,415 bp). The S. wilsonii chloroplast genome encoded 115 unique genes, including four rRNA genes, 30 tRNA genes, 78 protein-coding genes, and three pseudogenes. Repetitive sequence analysis identified 32 tandem repeats, 22 forward repeats, two reverse repeats, and five palindromic repeats. Additionally, a total of 118 perfect microsatellites were detected, with mononucleotide repeats being the most common (89.83%). By comparing the S. wilsonii chloroplast genome with those of other rosid plant species, significant contractions or expansions were identified at the IR-LSC/SSC borders. Phylogenetic analysis of 17 willow species confirmed that S. wilsonii was most closely related to S. chaenomeloides and revealed the monophyly of the genus Salix. The complete S. wilsonii chloroplast genome provides an additional sequence-based resource for studying the evolution of organelle genomes in woody plants.
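
The figures in this abstract can be cross-checked with simple arithmetic: a quadripartite chloroplast genome consists of one LSC, one SSC, and two IR copies, and the gene categories should sum to the reported total. The short Python sketch below does this check using only numbers quoted in the abstract; the function name is a hypothetical helper.

```python
def quadripartite_length(lsc, ssc, ir):
    """Total length of a circular genome with one LSC, one SSC and two IRs."""
    return lsc + ssc + 2 * ir


# Region lengths in bp, as quoted in the abstract.
total = quadripartite_length(lsc=84_638, ssc=16_282, ir=27_415)
print(total)  # 155750, matching the reported pseudomolecule length

# Gene categories: rRNA + tRNA + protein-coding + pseudogenes.
print(4 + 30 + 78 + 3)  # 115, matching the reported number of unique genes

# Mononucleotide repeats: 89.83% of 118 perfect microsatellites.
print(round(0.8983 * 118))  # 106
```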


A Genome Resource for the Apple Powdery Mildew Pathogen Podosphaera leucotricha

Phytopathology ◽

10.1094/phyto-05-20-0158-a ◽

2020 ◽

Vol 110 (11) ◽

pp. 1756-1758

Author(s):

Lederson Gañán ◽

Richard Allen White ◽

Maren L. Friesen ◽

Tobin L. Peever ◽

Achour Amiri

Keyword(s):

Powdery Mildew ◽

Population Biology ◽

Molecular Mechanisms ◽

Draft Genome ◽

Single Copy ◽

Sequencing Data ◽

Pome Fruit ◽

Podosphaera Leucotricha ◽

A Genome ◽

Powdery Mildew Pathogen

Powdery mildew, caused by Podosphaera leucotricha, is an economically important disease of apple and pear trees. A single monoconidial strain (PuE-3) of this biotrophic fungus was used to extract DNA for Illumina sequencing. Data were assembled to form a draft genome of 43.8 Mb consisting of 8,921 contigs, 9,372 predicted genes, and 96.1% of complete benchmarking universal single copy orthologs (BUSCOs). This is the first reported genome sequence of P. leucotricha that will enable studies of the population biology, epidemiology, and fungicide resistance of this pathogen. Furthermore, this resource will be fundamental to uncover the genetic and molecular mechanisms of the apple−powdery mildew interaction, and support future pome fruit breeding efforts.


Identifying contamination with advanced visualization and analysis practices: metagenomic approaches for eukaryotic genome assemblies

10.7287/peerj.preprints.1695 ◽

2016 ◽

Author(s):

Tom O Delmont ◽

A. Murat Eren

Keyword(s):

High Throughput Sequencing ◽

Draft Genome ◽

Cost Effective ◽

Single Copy ◽

Eukaryotic Genome ◽

Metagenomic Data ◽

Sequencing Data ◽

Long Read ◽

Domains Of Life ◽

Genome Assemblies

High-throughput sequencing provides a fast and cost-effective means to recover genomes of organisms from all domains of life. However, adequate curation of the assembly results against potential contamination of non-target organisms requires advanced bioinformatics approaches and practices. Here, we re-analyzed the sequencing data generated for the tardigrade Hypsibius dujardini using approaches routinely employed by microbial ecologists who reconstruct bacterial and archaeal genomes from metagenomic data. We created a holistic display of the eukaryotic genome assembly using DNA data originating from two groups and eleven sequencing libraries. By using bacterial single-copy genes, k-mer frequencies, and coverage values of scaffolds we could identify and characterize multiple near-complete bacterial genomes, and curate a 182 Mbp draft genome for H. dujardini supported by RNA-Seq data. Our results indicate that most contaminant scaffolds were assembled from Moleculo long-read libraries, and most of these contaminants differed between library preparations. Our re-analysis shows that visualization and curation of eukaryotic genome assemblies can benefit from tools designed to address the needs of today’s microbiologists, who are constantly challenged by the difficulties associated with the identification of distinct microbial genomes in complex environmental metagenomes.
