Cost-efficient high throughput capture of museum arthropod specimen DNA using PCR-generated baits

2018 ◽  
Author(s):  
Alexander Knyshov ◽  
Eric R.L. Gordon ◽  
Christiane Weirauch

Abstract

Gathering genetic data for rare species is one of the biggest remaining obstacles in modern phylogenetics, particularly for megadiverse groups such as arthropods. Next-generation sequencing techniques allow for sequencing of short DNA fragments contained in preserved specimens more than 20 years old, but approaches such as whole-genome sequencing are often too expensive for projects including many taxa. Several methods of reduced-representation sequencing have been proposed that lower the cost of sequencing per specimen, but many remain costly because they involve synthesizing nucleotide probes and target hundreds of loci. These datasets are also frequently unique to each project and thus generally incompatible with other similar datasets.

Here, we explore the use of in-house-generated DNA baits to capture commonly utilized mitochondrial and ribosomal DNA loci from insect museum specimens of various ages and preservation types, without the a priori need to know the sequence of the target loci. Both within-species and cross-species capture are explored, on preserved specimens ranging in age from one to 54 years old.

We found that most samples produced sufficient data to assemble the nuclear ribosomal rRNA genes and near-complete mitochondrial genomes, and to produce well-resolved phylogenies in line with expected results. The dataset obtained can be straightforwardly combined with the large cache of existing Sanger-sequencing-generated data built up over the past 30 years, and the targeted loci can be easily modified to those commonly used in different taxa. Furthermore, the protocol we describe allows for inexpensive data generation (as low as ∼$35/sample) of at least 20 kilobases per specimen, for specimens at least as old as ∼1965, and can be easily conducted in most laboratories.

If widely applied, this technique will accelerate the accurate resolution of the Tree of Life, especially for non-model organisms with limited existing genomic resources.

2020 ◽  
Author(s):  
Lucas Costa ◽  
André Marques ◽  
Chris Buddenhagen ◽  
William Wayt Thomas ◽  
Bruno Huettel ◽  
...  

SUMMARY

With the advance of high-throughput sequencing (HTS), reduced-representation methods such as target capture sequencing (TCS) have emerged as cost-efficient ways of gathering genomic information. As the off-target reads from such sequencing are expected to be similar to genome skims (GS), we assessed the quality of repeat characterization using these data.

For this, the repeat composition of TCS datasets from five Rhynchospora (Cyperaceae) species was compared with GS data from the same taxa.

All the major repetitive DNA families were identified in TCS, including repeats with abundances as low as 0.01% in the GS data. Rank correlations between GS and TCS repeat abundances were moderately high (r = 0.58-0.85), increasing after filtering out the targeted loci from the raw TCS reads (r = 0.66-0.92). Repeat data obtained by TCS were also reliable for developing a cytogenetic probe and for resolving the phylogenetic relationships of Rhynchospora species with high support.

In light of our results, TCS data can be effectively used for cyto- and phylogenomic investigations of repetitive DNA. Given the growing availability of HTS reads, driven by global phylogenomic projects, our strategy represents a way to recycle genomic data and contribute to a better characterization of plant biodiversity.
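The rank-correlation comparison described above can be sketched in a few lines. This is a minimal pure-Python Spearman correlation (the no-ties formula) applied to invented repeat-family abundances, not the study's actual data:

```python
def ranks(values):
    # rank 1 = smallest value; assumes no ties, which abundance estimates rarely have
    order = sorted(range(len(values)), key=values.__getitem__)
    r = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        r[idx] = rank
    return r

def spearman(x, y):
    # Spearman's r via the classic no-ties formula: 1 - 6*sum(d^2) / (n*(n^2-1))
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))

# Invented repeat-family abundances (% of reads): genome skimming (GS) vs.
# the off-target fraction of target-capture (TCS) reads.
gs  = {"Ty1-copia": 4.2, "Ty3-gypsy": 7.9, "satellite": 0.35,
       "LINE": 0.8, "rDNA": 1.1, "rare_sat": 0.01}
tcs = {"Ty1-copia": 3.6, "Ty3-gypsy": 8.4, "satellite": 0.28,
       "LINE": 0.6, "rDNA": 0.5, "rare_sat": 0.02}

families = sorted(gs)
rho = spearman([gs[f] for f in families], [tcs[f] for f in families])
print(f"Spearman r = {rho:.2f}")  # high rank agreement despite shifted values
```

Because only ranks enter the statistic, systematic abundance shifts between the two read sources (as expected between GS and off-target TCS reads) do not degrade the correlation.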


2021 ◽  
Author(s):  
Stephanie Szarmach ◽  
Alan Brelsford ◽  
Christopher C Witt ◽  
David Toews

Researchers seeking to generate genomic data for non-model organisms face a number of trade-offs when deciding which method to use. The choice between reduced-representation approaches and whole-genome re-sequencing will ultimately affect marker density, sequencing depth, and the number of individuals that can be multiplexed. These factors can affect researchers' ability to accurately characterize certain genomic features, such as landscapes of divergence (how FST varies across the genome). To provide insight into the effect of sequencing method on the estimation of divergence landscapes, we applied an identical bioinformatic pipeline to three generations of sequencing data (GBS, ddRAD, and WGS) produced for the same system, the yellow-rumped warbler species complex. We compare divergence landscapes generated using each method for the myrtle warbler (Setophaga coronata coronata) and the Audubon's warbler (S. c. auduboni), and for Audubon's warblers with deeply divergent mtDNA resulting from mitochondrial introgression. We found that most high-FST peaks were not detected in the ddRAD dataset, and that while both GBS and WGS identified the presence of large peaks, WGS was superior at a finer scale. Comparing Audubon's warblers with divergent mitochondrial haplotypes, only WGS allowed us to identify small (10-20 kb) regions of elevated differentiation, one of which contained the nuclear-encoded mitochondrial gene NDUFAF3. We calculated the cost per base pair for each method and found it comparable between GBS and WGS, but significantly higher for ddRAD. These comparisons highlight the advantages of WGS over reduced-representation methods when characterizing landscapes of divergence.
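A divergence landscape of the kind compared above is, at its core, FST computed in genomic windows. The sketch below uses Hudson's two-population estimator (ratio of averages within each window) on invented allele frequencies and sample sizes; the study's own pipeline starts from genotype calls and is not reproduced here:

```python
# Hudson FST per SNP, combined per window as a ratio of averages.
def hudson_fst_components(p1, p2, n1, n2):
    # p1, p2: alternate-allele frequencies in the two populations
    # n1, n2: number of sampled chromosomes per population
    num = (p1 - p2) ** 2 - p1 * (1 - p1) / (n1 - 1) - p2 * (1 - p2) / (n2 - 1)
    den = p1 * (1 - p2) + p2 * (1 - p1)
    return num, den

def windowed_fst(snps, window=10_000, n1=20, n2=20):
    # snps: list of (position, p1, p2); returns {window_start: FST}
    windows = {}
    for pos, p1, p2 in snps:
        w = pos // window
        num, den = hudson_fst_components(p1, p2, n1, n2)
        tot = windows.setdefault(w, [0.0, 0.0])
        tot[0] += num
        tot[1] += den
    return {w * window: num / den
            for w, (num, den) in windows.items() if den > 0}

# Invented SNPs: two strongly differentiated sites in the first window,
# one near-undifferentiated site in the second.
snps = [(1_200, 0.10, 0.90), (3_400, 0.20, 0.85), (15_000, 0.50, 0.55)]
for start, fst in sorted(windowed_fst(snps).items()):
    print(f"window {start}-{start + 9_999}: FST = {fst:.2f}")
```

Window size is the lever the abstract's comparison turns on: sparse markers (ddRAD) leave many windows empty or noisy, while dense WGS data can resolve 10-20 kb peaks.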


2020 ◽  
Author(s):  
Brandon T. Sinn ◽  
Sandra J. Simon ◽  
Mathilda V. Santee ◽  
Stephen P. DiFazio ◽  
Nicole M. Fama ◽  
...  

ABSTRACT

The capability to generate densely sampled single nucleotide polymorphism (SNP) data is essential in diverse subdisciplines of biology, including crop breeding, pathology, forensics, forestry, ecology, evolution, and conservation. However, access to the expensive equipment and bioinformatics infrastructure required for genome-scale sequencing is still a limiting factor in the developing world and for institutions with limited resources.

Here we present ISSRseq, a PCR-based method for reduced representation of genomic variation that uses simple sequence repeats as priming sites to sequence inter-simple sequence repeat (ISSR) regions. Briefly, ISSR regions are amplified with single primers, pooled, used to construct sequencing libraries with a low-cost, efficient commercial kit, and sequenced on the Illumina platform. We also present a flexible bioinformatic pipeline that assembles ISSR loci, calls and hard-filters variants, outputs data matrices in common formats, and conducts population analyses using R.

Using three angiosperm species as case studies, we demonstrate that ISSRseq is highly repeatable, requires only simple wet-lab skills and commonplace instrumentation, is flexible in the number of single primers used, is low-cost, and can generate genomic-scale variant discovery on par with existing RRS methods that require high sample integrity and concentration.

ISSRseq represents a straightforward approach to SNP genotyping in any organism, and we predict that this method will be particularly useful for those studying the population genomics and phylogeography of non-model organisms. Furthermore, the relative ease of ISSRseq compared with other RRS methods should prove useful for those conducting research in undergraduate and graduate environments, and more broadly for those lacking access to expensive instrumentation or bioinformatics expertise.
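The "calls and hard filters variants" step of a pipeline like the one described can be illustrated with a minimal VCF hard filter. The thresholds and the bare-bones tab parsing below are assumptions for the sketch, not ISSRseq's actual filter settings:

```python
# Keep biallelic SNPs passing minimal site quality (QUAL) and depth (INFO DP).
def hard_filter(vcf_lines, min_qual=30.0, min_depth=10):
    kept = []
    for line in vcf_lines:
        if line.startswith("#"):          # pass header lines through
            kept.append(line)
            continue
        fields = line.split("\t")
        ref, alt, qual, info = fields[3], fields[4], fields[5], fields[7]
        depth = 0
        for field in info.split(";"):     # pull DP out of the INFO column
            if field.startswith("DP="):
                depth = int(field[3:])
        if ("," not in alt and len(ref) == 1 and len(alt) == 1
                and float(qual) >= min_qual and depth >= min_depth):
            kept.append(line)
    return kept

vcf = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "locus1\t101\t.\tA\tG\t55.0\t.\tDP=24",
    "locus1\t164\t.\tC\tT\t12.0\t.\tDP=30",   # fails QUAL threshold
    "locus2\t33\t.\tG\tA,T\t80.0\t.\tDP=40",  # multiallelic, dropped
]
print(len(hard_filter(vcf)))  # header + 1 passing record = 2
```

Real pipelines would use dedicated VCF tooling rather than string parsing, but the pass/fail logic is the same.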


2020 ◽  
Author(s):  
Steven M. Mussmann ◽  
Marlis R. Douglas ◽  
Tyler K. Chafin ◽  
Michael E. Douglas

Abstract

Background: Research on the molecular ecology of non-model organisms, while previously constrained, has been greatly facilitated by the advent of reduced-representation sequencing protocols. However, tools that allow these large datasets to be efficiently parsed are often lacking or, where available, limited by the need for a comparable reference genome as an adjunct, which can be difficult when working with non-model organisms. Fortunately, pipelines are available that avoid this prerequisite. An oft-used molecular ecology program (Structure), for example, is supported by such pipelines, yet they are surprisingly absent for a second program that is similarly popular and computationally more efficient (Admixture). The two programs differ in that Admixture employs a maximum-likelihood framework whereas Structure uses a Bayesian approach, yet both produce similar results. Given these issues, there is an overriding (and recognized) need among researchers in molecular ecology for bioinformatic software that will not only condense output from replicated Admixture runs, but also infer from these data the optimal number of population clusters (K).

Results: Here we provide such a program (AdmixPipe) that (a) filters SNPs to allow the delineation of population structure in Admixture, then (b) parses the output for summarization and graphical representation via Clumpak. Our benchmarks demonstrate how efficiently the pipeline processes large, non-model datasets generated via double-digest restriction-site-associated DNA sequencing (ddRAD). Outputs not only parallel those from Structure, but also visualize the variation among individual Admixture runs, so as to facilitate selection of the most appropriate K-value.

Conclusions: AdmixPipe successfully integrates Admixture analysis with popular variant call format (VCF) filtering software to yield file types readily analyzed by Clumpak. Large population genomic datasets derived from non-model organisms are efficiently analyzed via the parallel-processing capabilities of Admixture. AdmixPipe is distributed under the GNU Public License and is freely available for Mac OS X and Linux platforms at https://github.com/stevemussmann/admixturePipeline.
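The "condense replicated runs and infer K" task can be sketched as follows. Admixture's `--cv` mode prints lines of the form `CV error (K=n): x`; the log contents and file handling below are invented for illustration and are not AdmixPipe's actual code:

```python
import re
from collections import defaultdict

# Matches the cross-validation line printed by `admixture --cv`.
CV_LINE = re.compile(r"CV error \(K=(\d+)\): ([0-9.]+)")

def best_k(log_texts):
    """Average CV error per K across replicate logs; return (best K, means)."""
    errors = defaultdict(list)
    for text in log_texts:
        for k, err in CV_LINE.findall(text):
            errors[int(k)].append(float(err))
    means = {k: sum(v) / len(v) for k, v in errors.items()}
    return min(means, key=means.get), means

# Two invented replicate logs: K=2 has the lowest mean CV error.
logs = [
    "CV error (K=1): 0.612\nCV error (K=2): 0.488\nCV error (K=3): 0.495",
    "CV error (K=1): 0.608\nCV error (K=2): 0.491\nCV error (K=3): 0.502",
]
k, means = best_k(logs)
print(f"best K = {k}")
```

Averaging over replicates matters because individual Admixture runs can converge to different optima; visualizing that run-to-run spread is exactly what the pipeline's Clumpak hand-off provides.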


2015 ◽  
Author(s):  
Aline Muyle ◽  
Jos Käfer ◽  
Niklaus Zemp ◽  
Sylvain Mousset ◽  
Franck Picard ◽  
...  

Abstract

Data deposition: During the review process, the SEX-DETector Galaxy workflow and associated test datasets are made available on the public galaxy.prabi.fr server. The data as well as the tool interface are visible to anonymous users, but to use them you should register for an account (“User → Register”) and import the data library “SEX-DETector” (“Shared Data → Data Libraries”) into your history. More instructions can be found in the “readme” file in this library. The user manual for SEX-DETector is available at https://lbbe.univ-lyon1.fr/Download-5251.html?lang=en.

Paper submitted as a Genome Resource.

We propose a probabilistic framework to infer autosomal and sex-linked genes from RNA-seq data of a cross for any sex chromosome type (XY, ZW, UV). Sex chromosomes (especially the non-recombining and repeat-dense Y, W, U and V) are notoriously difficult to sequence. Strategies have been developed to obtain partially assembled sex chromosome sequences, but most remain difficult to apply to non-model organisms, either because they require a reference genome or because they are designed for evolutionarily old systems. Sequencing a cross (parents and progeny) by RNA-seq to study the segregation of alleles and infer sex-linked genes is a cost-efficient strategy that also provides expression-level estimates. However, the lack of a proper statistical framework has limited broader application of this approach. Tests on empirical data show that our method identifies many more sex-linked genes than existing pipelines, while making reliable inferences for downstream analyses. Simulations suggest that few individuals are needed for optimal results. For species with an unknown sex-determination system, the method can assess the presence and type (XY versus ZW) of sex chromosomes through a model-comparison strategy. The method is particularly well suited to sex chromosomes of young or intermediate age, which are expected in thousands of as-yet-unstudied lineages. Any organism that can be bred in the lab, including non-model organisms for which nothing is known a priori, is suitable for our method. SEX-DETector is made freely available to the community through a Galaxy workflow.
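The segregation signal the method exploits can be caricatured deterministically: at an XY-linked gene, the father is heterozygous (one X allele, one Y allele), every son inherits the father's Y allele, and no daughter does. The check below is only that caricature on invented genotypes; the actual method models this probabilistically, accounting for genotyping error and expression levels:

```python
def looks_xy_linked(father, mother, sons, daughters):
    """Naive XY-segregation check on (allele, allele) genotype tuples."""
    if len(set(father)) != 2:            # father must be heterozygous
        return False
    # The allele the father shares with the mother plays the X role here;
    # the other one plays the Y role (a simplification, not a proof).
    x_allele = next(a for a in father if a in mother)
    y_allele = next(a for a in father if a != x_allele)
    return (all(y_allele in g for g in sons)
            and all(y_allele not in g for g in daughters))

father, mother = ("A", "G"), ("A", "A")
sons = [("A", "G"), ("A", "G")]          # all carry the putative Y allele "G"
daughters = [("A", "A"), ("A", "A")]     # none carry it
print(looks_xy_linked(father, mother, sons, daughters))  # True
```

A ZW system is the mirror image (heterozygous mother, W allele confined to daughters), which is why a model-comparison strategy between the two patterns can identify the sex-determination system.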


Author(s):  
Earley H. ◽  
Mealy K.

Abstract

Introduction: Postgraduate specialty training in Ireland is associated with considerable cost. Some of these costs are mandatory, such as medical council fees, while others are necessary to ensure career progression, such as attendance at courses and conferences. Surgical specialties in particular are believed to be associated with high training costs, but it is unknown how these costs compare to those borne by counterparts in other specialties.

Aims: The aims of this study were to (1) quantify the amount that trainees in Ireland spend on postgraduate training and (2) determine whether a difference exists between surgery and other non-skill-based specialties in terms of expenditure on training.

Methods: A standardised non-mandatory questionnaire was circulated to trainees across two training centres in Ireland. Trainees at all levels were invited to participate.

Results: Sixty responses were obtained; 57 questionnaires were fully completed and included for analysis. Median expenditure on training was higher for surgical than for non-surgical specialties. Subgroup analysis revealed that surgical training was associated with higher expenditure on higher degrees and courses compared to medical training (p = 0.035). More than 95% of trainees surveyed felt that greater financial support should be available during the course of their training.

Conclusions: This study demonstrated that a career in surgery is associated with higher ongoing costs for higher degrees and courses than non-surgical training. All surgical trainees surveyed felt that better financial support should be available. Increasing financial support may be a tangible way to mitigate attrition during training.


Author(s):  
Mohammad Istiak Hossain ◽  
Jan I. Markendahl

Abstract

Small-scale commercial rollouts of Cellular-IoT (C-IoT) networks have been underway globally since last year. However, among the plethora of low-power wide-area network (LPWAN) technologies, the cost-effectiveness of C-IoT is not certain for IoT service providers, small operators, and greenfield operators. Today, there is no known public framework for the feasibility analysis of IoT communication technologies. Hence, this paper first presents a generic framework to assess the cost structure of cellular and non-cellular LPWAN technologies. We then apply the framework in eight deployment scenarios to analyze the prospects of LPWAN technologies such as Sigfox, LoRaWAN, NB-IoT, LTE-M, and EC-GSM, taking into account the impact of inter-technology interference on LoRaWAN and Sigfox scalability. Our results show that a large rollout with a single technology is not cost-efficient, and our analysis suggests that the viability of rolling out an IoT communication technology may not scale linearly with its cost-efficiency.


2021 ◽  
Vol 13 (11) ◽  
pp. 6075
Author(s):  
Ola Lindroos ◽  
Malin Söderlind ◽  
Joel Jensen ◽  
Joakim Hjältén

Translocation of dead wood is a novel method for ecological compensation and restoration that could, potentially, provide a new important tool for biodiversity conservation. With this method, substrates that normally have long delivery times are instantly created in a compensation area, and ideally many of the associated dead wood dwelling organisms are translocated together with the substrates. However, to a large extent, there is a lack of knowledge about the cost efficiency of different methods of ecological compensation. Therefore, the costs for different parts of a translocation process and its dependency on some influencing factors were studied. The observed cost was 465 SEK per translocated log for the actual compensation measure, with an additional 349 SEK/log for work to enable evaluation of the translocation’s ecological results. Based on time studies, models were developed to predict required work time and costs for different transportation distances and load sizes. Those models indicated that short extraction and insertion distances for logs should be prioritized over road transportation distances to minimize costs. They also highlighted a trade-off between costs and time until a given ecological value is reached in the compensation area. The methodology used can contribute to more cost-efficient operations and, by doing so, increase the use of ecological compensation and the benefits from a given input.
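The time-and-cost models described above can be sketched as a simple linear model of work time per load. Every coefficient below (minutes, SEK rates, load size) is an invented placeholder, not a fitted value from the study; the point is only the structure, in which off-road distance enters twice (extraction and insertion) while road distance enters once:

```python
MACHINE_COST_SEK_PER_MIN = 15.0      # invented machine + operator rate

def minutes_per_load(extract_m, road_km, insert_m):
    fixed = 12.0                     # loading/unloading overhead per load
    return (fixed
            + 0.04 * extract_m       # off-road extraction, minutes per metre
            + 1.5 * road_km          # road transport, minutes per km
            + 0.04 * insert_m)       # off-road insertion at compensation site

def cost_per_log(extract_m, road_km, insert_m, logs_per_load=10):
    minutes = minutes_per_load(extract_m, road_km, insert_m)
    return minutes * MACHINE_COST_SEK_PER_MIN / logs_per_load

# Tripling the off-road distances raises cost per log substantially more
# than the road leg contributes, illustrating why short extraction and
# insertion distances should be prioritized:
print(cost_per_log(extract_m=200, road_km=10, insert_m=200))
print(cost_per_log(extract_m=600, road_km=10, insert_m=600))
```

With a structure like this, fitted against time-study data, one can tabulate predicted cost per log across candidate compensation sites before committing to one.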


2020 ◽  
Vol 15 (1) ◽  
pp. 4-17
Author(s):  
Jean-François Biasse ◽  
Xavier Bonnetain ◽  
Benjamin Pring ◽  
André Schrottenloher ◽  
William Youmans

Abstract

We propose a heuristic algorithm to solve the underlying hard problem of the CSIDH cryptosystem (and other isogeny-based cryptosystems using elliptic curves with endomorphism ring isomorphic to an imaginary quadratic order 𝒪). Let Δ = Disc(𝒪) (in CSIDH, Δ = −4p for p the security parameter). Let 0 < α < 1/2; our algorithm requires:

- a classical circuit of size $2^{\tilde{O}\left(\log(|\Delta|)^{1-\alpha}\right)}$;
- a quantum circuit of size $2^{\tilde{O}\left(\log(|\Delta|)^{\alpha}\right)}$;
- polynomial classical and quantum memory.

Essentially, we propose to reduce the size of the quantum circuit below the state-of-the-art complexity $2^{\tilde{O}\left(\log(|\Delta|)^{1/2}\right)}$ at the cost of increasing the required classical circuit size. The required classical circuit remains subexponential, which is a superpolynomial improvement over the exponential classical state-of-the-art solutions to these problems. Our method requires polynomial memory, both classical and quantum.
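The classical/quantum trade-off above can be made concrete numerically. Writing L = log₂|Δ|, the circuit sizes are 2^(Õ(L^(1−α))) classically and 2^(Õ(L^α)) quantumly; ignoring the hidden polylogarithmic factors in the Õ, the sketch below tabulates the bare exponents for a CSIDH-512-like parameter (512-bit p, Δ = −4p, so L ≈ 514):

```python
L = 514.0  # log2|Delta| for a CSIDH-512-like instance (illustrative only)

for alpha in (0.2, 0.35, 0.5):
    classical_exp = L ** (1 - alpha)   # classical circuit ~ 2^(L^(1-alpha))
    quantum_exp = L ** alpha           # quantum circuit  ~ 2^(L^alpha)
    print(f"alpha={alpha:.2f}: classical ~2^{classical_exp:.0f}, "
          f"quantum ~2^{quantum_exp:.0f}")
```

At α = 1/2 the two exponents meet at L^(1/2) (the prior state of the art for the quantum side); pushing α below 1/2 shrinks the quantum exponent while the classical exponent grows yet stays subexponential in L.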

