Next-generation sequencing and systematics: What can a billion base pairs of DNA sequence data do for you?

Taxon ◽  
2011 ◽  
Vol 60 (6) ◽  
pp. 1552-1566 ◽  
Author(s):  
Nicola Harrison ◽  
Catherine Anne Kidner
Zootaxa ◽  
2008 ◽  
Vol 1807 (1) ◽  
pp. 26 ◽  
Author(s):  
DAVID S. McLEOD

A new species of the dicroglossine genus Limnonectes from eastern Thailand and its tadpole are described. Analysis of DNA sequence data from 2518 base-pairs of the mitochondrial 12S and 16S gene regions places the species within the complex of frogs currently referred to as Limnonectes kuhlii and demonstrates it to be a separate lineage (>18% sequence divergence from type-material of L. kuhlii from Java). The new species differs from L. kuhlii by having nuptial pads, a greater snout–vent length, and different relative finger lengths than specimens from Java. It has more extensive toe webbing, a different arrangement of nuptial pads, and a greater snout–vent length than Limnonectes laticeps. The new species, which lacks vocal slits, also can be distinguished from the morphologically similar Limnonectes namiyei from Japan, which possesses vocal slits.
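The ">18% sequence divergence" figure above is a pairwise comparison of aligned sequences. The abstract does not say which distance measure was used, so the following is only a minimal sketch of one common choice, the uncorrected p-distance (the fraction of compared sites that differ). The function name `p_distance` and the rule of skipping gap (`-`) and ambiguous (`N`) sites are assumptions made for illustration, not the author's method.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Uncorrected pairwise distance between two aligned sequences.

    Sites where either sequence has a gap ('-') or an ambiguous base ('N')
    are skipped (an assumption; other conventions exist).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue  # skip sites that cannot be compared
        compared += 1
        if a != b:
            differing += 1

    if compared == 0:
        raise ValueError("no comparable sites in the alignment")
    return differing / compared


# Toy alignment (made-up sequences, not data from the paper):
# 2 differences over 10 compared sites.
toy_divergence = p_distance("ACGTACGTAC", "ACGTTCGTAA")
```

Multiplying the result by 100 gives a percentage that can be compared against a threshold such as the 18% quoted above.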


2020 ◽  
Vol 79 (2) ◽  
pp. 105-113
Author(s):  
Abdul Bari Muneera Parveen ◽  
Divya Lakshmanan ◽  
Modhumita Ghosh Dasgupta

The advent of next-generation sequencing has facilitated large-scale discovery and mapping of genomic variants for high-throughput genotyping. Several research groups working in tree species are presently employing next generation sequencing (NGS) platforms for marker discovery, since it is a cost effective and time saving strategy. However, most trees lack a chromosome level genome map and validation of variants for downstream application becomes obligatory. The cost associated with identifying potential variants from the enormous amount of sequence data is a major limitation. In the present study, high resolution melting (HRM) analysis was optimized for rapid validation of single nucleotide polymorphisms (SNPs), insertions or deletions (InDels) and simple sequence repeats (SSRs) predicted from exome sequencing of parents and hybrids of Eucalyptus tereticornis Sm. × Eucalyptus grandis Hill ex Maiden generated from controlled hybridization. The cost per data point was less than 0.5 USD, providing great flexibility in terms of cost and sensitivity, when compared to other validation methods. The sensitivity of this technology in variant detection can be extended to other applications including Bar-HRM for species authentication and TILLING for detection of mutants.


2019 ◽  
Vol 9 (1) ◽  
Author(s):  
Suguru Takeuchi ◽  
Jun-ichi Kawada ◽  
Kazuhiro Horiba ◽  
Yusuke Okuno ◽  
Toshihiko Okumura ◽  
...  

Abstract Next-generation sequencing (NGS) has been applied in the field of infectious diseases. Bronchoalveolar lavage fluid (BALF) is considered a sterile type of specimen that is suitable for detecting pathogens of respiratory infections. The aim of this study was to comprehensively identify causative pathogens using NGS in BALF samples from immunocompetent pediatric patients with respiratory failure. Ten patients hospitalized with respiratory failure were included. BALF samples obtained in the acute phase were used to prepare DNA- and RNA-sequencing libraries. The libraries were sequenced on MiSeq, and the sequence data were analyzed using metagenome analysis tools. A mean of 2,041,216 total reads were sequenced for each library. Significant bacterial or viral sequencing reads were detected in eight of the 10 patients. Furthermore, candidate pathogens were detected in three patients in whom etiologic agents were not identified by conventional methods. The complete genome of enterovirus D68 was identified in two patients, and phylogenetic analysis suggested that both strains belong to subclade B3, which is an epidemic strain that has spread worldwide in recent years. Our results suggest that NGS can be applied for comprehensive molecular diagnostics as well as surveillance of pathogens in BALF from patients with respiratory infection.


2010 ◽  
Vol 76 (12) ◽  
pp. 3863-3868 ◽  
Author(s):  
J. Kirk Harris ◽  
Jason W. Sahl ◽  
Todd A. Castoe ◽  
Brandie D. Wagner ◽  
David D. Pollock ◽  
...  

ABSTRACT Constructing mixtures of tagged or bar-coded DNAs for sequencing is an important requirement for the efficient use of next-generation sequencers in applications where limited sequence data are required per sample. There are many applications in which next-generation sequencing can be used effectively to sequence large mixed samples; an example is the characterization of microbial communities where ≤1,000 sequences per sample are adequate to address research questions. Thus, it is possible to examine hundreds to thousands of samples per run on massively parallel next-generation sequencers. However, the cost savings for efficient utilization of sequence capacity are realized only if the production and management costs associated with construction of multiplex pools are also scalable. One critical step in multiplex pool construction is the normalization process, whereby equimolar amounts of each amplicon are mixed. Here we compare three approaches (spectroscopy, size-restricted spectroscopy, and quantitative binding) for normalization of large, multiplex amplicon pools for performance and efficiency. We found that the quantitative binding approach was superior and represents an efficient scalable process for construction of very large, multiplex pools with hundreds and perhaps thousands of individual amplicons included. We demonstrate the increased sequence diversity identified with higher throughput. Massively parallel sequencing can dramatically accelerate microbial ecology studies by allowing appropriate replication of sequence acquisition to account for temporal and spatial variations. Further, population studies to examine genetic variation, which require even lower levels of sequencing, should be possible where thousands of individual bar-coded amplicons are examined in parallel.
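The normalization step described above, mixing equimolar amounts of each amplicon, comes down to simple arithmetic when a concentration has been measured for each amplicon. The sketch below shows that arithmetic only. It is not the authors' protocol, and it does not model the quantitative binding approach they found superior. The function names, the sample values, and the use of 660 g/mol as the average mass of a double-stranded DNA base pair (a common approximation) are assumptions made for illustration.

```python
# Approximate average molar mass of one double-stranded DNA base pair (g/mol).
BP_MASS_G_PER_MOL = 660.0


def molarity_nm(conc_ng_per_ul: float, length_bp: int) -> float:
    """Convert a mass concentration (ng/uL) of dsDNA to molarity (nM)."""
    if conc_ng_per_ul <= 0 or length_bp <= 0:
        raise ValueError("concentration and length must be positive")
    # ng/uL equals 1e-3 g/L; dividing by g/mol gives mol/L; 1e9 converts to nM.
    return conc_ng_per_ul / (BP_MASS_G_PER_MOL * length_bp) * 1e6


def pooling_volumes_ul(
    amplicons: dict[str, tuple[float, int]], target_fmol: float
) -> dict[str, float]:
    """Volume (uL) of each amplicon that contributes `target_fmol` to the pool.

    `amplicons` maps a sample name to (concentration in ng/uL, length in bp).
    1 nM equals 1 fmol/uL, so volume = target amount / molarity.
    """
    return {
        name: target_fmol / molarity_nm(conc, length)
        for name, (conc, length) in amplicons.items()
    }


# Hypothetical measurements, not data from the study.
example = {"sample_a": (10.0, 500), "sample_b": (20.0, 500)}
volumes = pooling_volumes_ul(example, target_fmol=100.0)
```

With these made-up inputs, the sample at twice the concentration needs half the volume, which is the property equimolar pooling relies on.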


2015 ◽  
Vol 33 (15_suppl) ◽  
pp. e12521-e12521
Author(s):  
Jessica Ribeiro Gomes ◽  
Raphael Brandao Moreira ◽  
Renata D'Alpino D'Alpino ◽  
Marcelo Rocha S Cruz ◽  
Tercia Tarciane Soares de Sousa ◽  
...  

2016 ◽  
Author(s):  
Paolo Devanna ◽  
Xiaowei Sylvia Chen ◽  
Joses Ho ◽  
Dario Gajewski ◽  
Alessandro Gialluisi ◽  
...  

ABSTRACT Next generation sequencing has opened the way for the large scale interrogation of cohorts at the whole exome, or whole genome level. Currently, the field largely focuses on potential disease causing variants that fall within coding sequences and that are predicted to cause protein sequence changes, generally discarding non-coding variants. However, non-coding DNA makes up ~98% of the genome and contains a range of sequences essential for controlling the expression of protein coding genes. Thus, potentially causative non-coding variation is currently being overlooked. To address this, we have designed an approach to assess variation in one class of non-coding regulatory DNA: the 3′UTRome. Variants in the 3′UTR region of genes are of particular interest because 3′UTRs are responsible for modulating protein expression levels via their interactions with microRNAs. Furthermore, they are amenable to large scale analysis as 3′UTR-microRNA interactions are based on complementary base pairing and as such can be predicted in silico at the genome-wide level. We report a strategy for identifying and functionally testing variants in microRNA binding sites within the 3′UTRome and demonstrate the efficacy of this pipeline in a cohort of language impaired children. Using whole exome sequence data from 43 probands, we extracted variants that lay within 3′UTR microRNA binding sites. We identified a common variant (SNP) in a microRNA binding site and found this SNP to be associated with an endophenotype of language impairment (non-word repetition). We showed that this variant disrupted microRNA regulation in cells and was linked to altered gene expression in the brain, suggesting it may represent a risk factor contributing to specific language impairment (SLI). This work demonstrates that biologically relevant variants are currently being under-investigated despite the wealth of next-generation sequencing data available and presents a simple strategy for interrogating non-coding regions of the genome. We propose that this strategy should be routinely applied to whole exome and whole genome sequence data in order to broaden our understanding of how non-coding genetic variation underlies complex phenotypes such as neurodevelopmental disorders.


2021 ◽  
Author(s):  
Jean-Pierre Kocher ◽  
Zachary Stephens ◽  
Daniel O'Brien ◽  
Mrunal Dehankar ◽  
Lewis Roberts ◽  
...  

The integration of viruses into the human genome is known to be associated with tumorigenesis in many cancers, but the accurate detection of integration breakpoints from short read sequencing data is made difficult by human-viral homologies, viral genome heterogeneity, coverage limitations, and other factors. To address this, we present Exogene, a sensitive and efficient workflow for detecting viral integrations from paired-end next generation sequencing data. Exogene's read filtering and breakpoint detection strategies yield integration coordinates that are highly concordant with those found in long read validation sets. We demonstrate this concordance across 6 TCGA Hepatocellular carcinoma (HCC) tumor samples, identifying integrations of hepatitis B virus that are validated by long reads. Additionally, we applied Exogene to targeted capture data from 426 previously studied HCC samples, achieving 98.9% concordance with existing methods and identifying 238 high-confidence integrations that were not previously reported. Exogene is applicable to multiple types of paired-end sequence data, including genome, exome, RNA-Seq or targeted capture.


2018 ◽  
Vol 46 (4) ◽  
pp. 931-936 ◽  
Author(s):  
José P. Faria ◽  
Miguel Rocha ◽  
Isabel Rocha ◽  
Christopher S. Henry

In the era of next-generation sequencing and ubiquitous assembly and binning of metagenomes, new putative genome sequences are being produced from isolate and microbiome samples at ever-increasing rates. Genome-scale metabolic models have enormous utility for supporting the analysis and predictive characterization of these genomes based on sequence data. As a result, tools for rapid automated reconstruction of metabolic models are becoming critically important for supporting the analysis of new genome sequences. Many tools and algorithms have now emerged to support rapid model reconstruction and analysis. Here, we are comparing and contrasting the capabilities and output of a variety of these tools, including ModelSEED, Raven Toolbox, PathwayTools, SuBliMinal Toolbox and merlin.


2013 ◽  
Vol 2013 ◽  
pp. 1-8 ◽  
Author(s):  
O. M. Vanakker ◽  
A. De Paepe

Pharmacogenetics is considered as a prime example of how personalized medicine nowadays can be put into practice. However, genotyping to guide pharmacological treatment is relatively uncommon in the routine clinical practice. Several reasons can be found why the application of pharmacogenetics is less than initially anticipated, which include the contradictory results obtained for certain variants and the lack of guidelines for clinical implementation. However, more reproducible results are being generated, and efforts have been made to establish working groups focussing on evidence-based clinical guidelines. For another pharmacogenetic hurdle, the speed by which a pharmacogenetic profile for a certain drug can be obtained in an individual patient, there has been a revolution in molecular genetics through the introduction of next generation sequencing (NGS), making it possible to sequence a large number of genes up to the complete genome in a single reaction. Besides the enthusiasm due to the tremendous increase of our sequencing capacities, several considerations need to be made regarding quality and interpretation of the sequence data as well as ethical aspects of this technology. This paper will focus on the different NGS applications that may be useful for pharmacogenomics in children and the challenges that they bring on.

