Phylogenomic inferences from reference-mapped and de novo assembled short-read sequence data using RADseq sequencing of California white oaks (Quercus section Quercus)

Genome ◽  
2017 ◽  
Vol 60 (9) ◽  
pp. 743-755 ◽  
Author(s):  
Sorel Fitz-Gibbon ◽  
Andrew L. Hipp ◽  
Kasey K. Pham ◽  
Paul S. Manos ◽  
Victoria L. Sork

The emergence of next-generation sequencing has increased by several orders of magnitude the amount of data available for phylogenetics. Reduced representation approaches, such as restriction-site associated DNA sequencing (RADseq), have proven useful for phylogenetic studies of non-model species at a wide range of phylogenetic depths. However, analysis of these datasets is not uniform and we know little about the potential benefits and drawbacks of de novo assembly versus assembly by mapping to a reference genome. Using RADseq data for 83 oak samples representing 16 taxa, we identified variants via three pipelines: mapping sequence reads to a recently published draft genome of Quercus lobata, and de novo assembly under two sets of locus filters. For each pipeline, we inferred the maximum likelihood phylogeny. All pipelines produced similar trees, with minor shifts in relationships within well-supported clades, despite the fact that they yielded different numbers of loci (68 000 – 111 000 loci) and different degrees of overlap with the reference genome. We conclude that both the reference-aligned and de novo assembly pipelines yield reliable results, and that advantages and disadvantages of these approaches pertain mainly to downstream uses of RADseq data, not to phylogenetic inference per se.

2011 ◽  
Vol 56 (3) ◽  
pp. 1539-1547 ◽  
Author(s):  
Stephanie Sandiford ◽  
Mathew Upton

ABSTRACT We describe the discovery, purification, characterization, and expression of an antimicrobial peptide, epidermicin NI01, which is an unmodified bacteriocin produced by Staphylococcus epidermidis strain 224. It is a highly cationic, hydrophobic, plasmid-encoded peptide that exhibits potent antimicrobial activity toward a wide range of pathogenic Gram-positive bacteria including methicillin-resistant Staphylococcus aureus (MRSA), enterococci, and biofilm-forming S. epidermidis strains. Purification of the peptide was achieved using a combination of hydrophobic interaction, cation exchange, and high-performance liquid chromatography (HPLC). Matrix-assisted laser desorption ionization–time of flight (MALDI-TOF) analysis yielded a molecular mass of 6,074 Da, and partial sequence data of the peptide were elucidated using a combination of tandem mass spectrometry (MS/MS) and de novo sequencing. The draft genome sequence of the producing strain was obtained using 454 pyrosequencing technology, thus enabling the identification of the structural gene using the de novo peptide sequence data previously obtained. Epidermicin NI01 contains 51 residues with four tryptophan and nine lysine residues, and the sequence showed approximately 50% identity to peptides lacticin Z, lacticin Q, and aureocin A53, all of which belong to a new family of unmodified type II-like bacteriocins. The peptide is active in the nanomolar range against S. epidermidis, MRSA isolates, and vancomycin-resistant enterococci. Other unique features displayed by epidermicin include a high degree of protease stability and the ability to retain antimicrobial activity over a pH range of 2 to 10, and exposure to the peptide does not result in development of resistance in susceptible isolates. In this study we also show the structural gene alone can be cloned into Escherichia coli strain BL21(DE3), and expression yields active peptide.


2008 ◽  
Vol 19 (2) ◽  
pp. 294-305 ◽  
Author(s):  
J. A. Reinhardt ◽  
D. A. Baltrus ◽  
M. T. Nishimura ◽  
W. R. Jeck ◽  
C. D. Jones ◽  
...  

Author(s):  
Guangtu Gao ◽  
Susana Magadan ◽  
Geoffrey C Waldbieser ◽  
Ramey C Youngblood ◽  
Paul A Wheeler ◽  
...  

Abstract There is still a need to improve the contiguity of the rainbow trout reference genome and to use multiple genetic backgrounds that represent the genetic diversity of this species. The Arlee doubled haploid line originated from a domesticated hatchery strain that was originally collected from the northern California coast. The Canu pipeline was used to generate the Arlee line genome de novo assembly from high-coverage PacBio long-read sequence data. The assembly was further improved with Bionano optical maps and Hi-C proximity ligation sequence data to generate 32 major scaffolds corresponding to the karyotype of the Arlee line (2N = 64). It is composed of 938 scaffolds with N50 of 39.16 Mb and a total length of 2.33 Gb, of which ∼95% was in 32 chromosome sequences with only 438 gaps between contigs and scaffolds. In rainbow trout the haploid chromosome number can vary from 29 to 32. In the Arlee karyotype the haploid chromosome number is 32 because chromosomes Omy04, 14 and 25 are divided into six acrocentric chromosomes. Additional structural variations that were identified in the Arlee genome included the major inversions on chromosomes Omy05 and Omy20 and 15 additional smaller inversions that will require further validation. This is also the first rainbow trout genome assembly that includes a scaffold with the sex-determination gene (sdY) in the chromosome Y sequence. The utility of this genome assembly is demonstrated through the improved annotation of the duplicated genome loci that harbor the IGH genes on chromosomes Omy12 and Omy13.


2020 ◽  
Author(s):  
Brendan N. Reid ◽  
Rachel L. Moran ◽  
Christopher J. Kopack ◽  
Sarah W. Fitzpatrick

Abstract Researchers studying non-model organisms have an increasing number of methods available for generating genomic data. However, the applicability of different methods across species, as well as the effect of reference genome choice on population genomic inference, are still difficult to predict in many cases. We evaluated the impact of data type (whole-genome vs. reduced representation) and reference genome choice on data quality and on population genomic and phylogenomic inference across several species of darters (subfamily Etheostomatinae), a highly diverse radiation of freshwater fish. We generated a high-quality reference genome and developed a hybrid RADseq/sequence capture (Rapture) protocol for the Arkansas darter (Etheostoma cragini). Rapture data from 1900 individuals spanning four darter species showed recovery of most loci across darter species at high depth and consistent estimates of heterozygosity regardless of reference genome choice. Loci with baits spanning both sides of the restriction enzyme cut site performed especially well across species. For low-coverage whole-genome data, choice of reference genome affected read depth and inferred heterozygosity. For similar amounts of sequence data, Rapture performed better at identifying fine-scale genetic structure compared to whole-genome sequencing. Rapture loci also recovered an accurate phylogeny for the study species and demonstrated high phylogenetic informativeness across the evolutionary history of the genus Etheostoma. Low cost and high cross-species effectiveness regardless of reference genome suggest that Rapture and similar sequence capture methods may be worthwhile choices for studies of diverse species radiations.


2019 ◽  
Vol 11 (7) ◽  
pp. 1965-1970 ◽  
Author(s):  
Nikola Palevich ◽  
Paul H Maclean ◽  
Abdul Baten ◽  
Richard W Scott ◽  
David M Leathwick

Abstract Internal parasitic nematodes are a global animal health issue causing drastic losses in livestock. Here, we report a Haemonchus contortus representative draft genome to serve as a genetic resource to the scientific community and support future experimental research of molecular mechanisms in related parasites. A de novo hybrid assembly was generated from PCR-free whole genome sequence data, resulting in a chromosome-level assembly that is 465 Mb in size encoding 22,341 genes. The genome sequence presented here is consistent with the genome architecture of the existing Haemonchus species and is a valuable resource for future studies regarding population genetic structures of parasitic nematodes. Additionally, comparative pan-genomics with other species of economically important parasitic nematodes has revealed highly open genomes and strong collinearities within the phylum Nematoda.


GigaScience ◽  
2020 ◽  
Vol 9 (3) ◽  
Author(s):  
Benjamin D Rosen ◽  
Derek M Bickhart ◽  
Robert D Schnabel ◽  
Sergey Koren ◽  
Christine G Elsik ◽  
...  

Abstract Background Major advances in selection progress for cattle have been made following the introduction of genomic tools over the past 10–12 years. These tools depend upon the Bos taurus reference genome (UMD3.1.1), which was created using now-outdated technologies and is hindered by a variety of deficiencies and inaccuracies. Results We present the new reference genome for cattle, ARS-UCD1.2, based on the same animal as the original to facilitate transfer and interpretation of results obtained from the earlier version, but applying a combination of modern technologies in a de novo assembly to increase continuity, accuracy, and completeness. The assembly includes 2.7 Gb and is >250× more continuous than the original assembly, with contig N50 >25 Mb and L50 of 32. We also greatly expanded supporting RNA-based data for annotation that identifies 30,396 total genes (21,039 protein coding). The new reference assembly is accessible in annotated form for public use. Conclusions We demonstrate that improved continuity of assembled sequence warrants the adoption of ARS-UCD1.2 as the new cattle reference genome and that increased assembly accuracy will benefit future research on this species.
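Several abstracts in this listing report N50 and L50 without defining them. As a minimal sketch, assuming the common convention that N50 is the length of the shortest sequence in the smallest set of longest sequences covering at least half of the total assembly length, and L50 is the number of sequences in that set, the statistics could be computed as follows. The function name and the toy lengths are illustrative only and do not come from any of the cited assemblies.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of contig or scaffold lengths.

    Assumes the common definition: sort lengths from longest to shortest
    and walk down the list until the running total reaches at least half
    of the summed length. N50 is the length at which that happens and
    L50 is how many sequences were needed to get there.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count


# Toy example with made-up lengths (total 300, so the threshold is 150):
# 80 + 70 = 150 reaches the threshold after two sequences.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50_l50(toy_lengths))  # (70, 2)
```

Under this definition a higher N50 together with a lower L50 indicates a more contiguous assembly, which is the sense in which the abstracts use these figures.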


2015 ◽  
Author(s):  
Jane Hawkey ◽  
Mohammad Hamidian ◽  
Ryan R Wick ◽  
David J Edwards ◽  
Helen Billman-Jacobe ◽  
...  

Background Insertion sequences (IS) are small transposable elements, commonly found in bacterial genomes. Identifying the location of IS in bacterial genomes can be useful for a variety of purposes including epidemiological tracking and predicting antibiotic resistance. However, IS are commonly present in multiple copies in a single genome, which complicates genome assembly and the identification of IS insertion sites. Here we present ISMapper, a mapping-based tool for identification of the site and orientation of IS insertions in bacterial genomes, direct from paired-end short read data. Results ISMapper was validated using three types of short read data: (i) simulated reads from a variety of species, (ii) Illumina reads from 5 isolates for which finished genome sequences were available for comparison, and (iii) Illumina reads from 7 Acinetobacter baumannii isolates for which predicted IS locations were tested using PCR. A total of 20 genomes, including 13 species and 32 distinct IS, were used for validation. ISMapper correctly identified 96% of known IS insertions in the analysis of simulated reads, and 98% in real Illumina reads. Subsampling of real Illumina reads to lower depths indicated ISMapper was reliable for average genome-wide read depths >20x. All ISAba1 insertions identified by ISMapper in the A. baumannii genomes were confirmed by PCR. In each A. baumannii genome, ISMapper successfully identified an IS insertion upstream of the ampC beta-lactamase that could explain phenotypic resistance to third-generation cephalosporins. The utility of ISMapper was further demonstrated by profiling genome-wide IS6110 insertions in 138 publicly available Mycobacterium tuberculosis genomes, revealing lineage-specific insertions and multiple insertion hotspots. Conclusions ISMapper provides a rapid and robust method for identifying IS insertion sites direct from short read data, with a high degree of accuracy demonstrated across a wide range of bacteria.


F1000Research ◽  
2018 ◽  
Vol 7 ◽  
pp. 297 ◽  
Author(s):  
Jason R. Miller ◽  
Sergey Koren ◽  
Kari A. Dilley ◽  
Derek M. Harkins ◽  
Timothy B. Stockwell ◽  
...  

Background: The tick cell line ISE6, derived from Ixodes scapularis, is commonly used for amplification and detection of arboviruses in environmental or clinical samples. Methods: To assist with sequence-based assays, we sequenced the ISE6 genome with single-molecule, long-read technology. Results: The draft assembly appears near complete based on gene content analysis, though it appears to lack some instances of repeats in this highly repetitive genome. The assembly appears to have separated the haplotypes at many loci. DNA short read pairs, used for validation only, mapped to the cell line assembly at a higher rate than they mapped to the Ixodes scapularis reference genome sequence. Conclusions: The assembly could be useful for filtering host genome sequence from sequence data obtained from cells infected with pathogens.


2022 ◽  
Author(s):  
Shinichi Morita ◽  
Tomoko F. Shibata ◽  
Tomoaki Nishiyama ◽  
Yuuki Kobayashi ◽  
Katsushi Yamaguchi ◽  
...  

Beetles are the largest insect order and one of the most successful animal groups in terms of number of species. The Japanese rhinoceros beetle Trypoxylus dichotomus (Coleoptera, Scarabaeidae, Dynastini) is a giant beetle with distinctive exaggerated horns present on the head and prothoracic regions of the male. T. dichotomus has been used as a research model in various fields such as evolutionary developmental biology, ecology, ethology, biomimetics, and drug discovery. In this study, a de novo assembly of 615 Mb, representing 80% of the genome size estimated by flow cytometry, was obtained using the 10x Chromium platform. The scaffold N50 length of the genome assembly was 8.02 Mb, with repetitive elements predicted to comprise 49.5% of the assembly. In total, 23,987 protein-coding genes were predicted in the genome. In addition, de novo assembly of the mitochondrial genome yielded a contig of 20,217 bp. We also analyzed the transcriptome by generating 16 RNA-seq libraries from a variety of tissues of both sexes and developmental stages, which allowed us to identify 13 co-expressed gene modules. The detailed genomic and transcriptomic information of T. dichotomus is the most comprehensive among those reported for any species of Dynastinae. This genomic information will be an excellent resource for further functional and evolutionary analyses, including the evolutionary origin and genetic regulation of beetle horns and the molecular mechanisms underlying sexual dimorphism.

