A Systematic Evaluation of High-Throughput Sequencing Approaches to Identify Low-Frequency Single Nucleotide Variants in Viral Populations

Viruses ◽  
2020 ◽  
Vol 12 (10) ◽  
pp. 1187
Author(s):  
David J. King ◽  
Graham Freimanis ◽  
Lidia Lasecka-Dykes ◽  
Amin Asfor ◽  
Paolo Ribeca ◽  
...  

High-throughput sequencing platforms, such as those provided by Illumina, are an efficient way to understand sequence variation within viral populations. However, challenges exist in distinguishing process-introduced error from biological variance, which significantly impacts our ability to identify sub-consensus single-nucleotide variants (SNVs). Here we have taken a systematic approach to evaluate laboratory and bioinformatic pipelines to accurately identify low-frequency SNVs in viral populations. Artificial DNA and RNA “populations” were created by introducing known SNVs at predetermined frequencies into template nucleic acid before being sequenced on an Illumina MiSeq platform. These were used to assess the effects of abundance and starting input material type, technical replicates, read length and quality, short-read aligner, and percentage frequency thresholds on the ability to accurately call variants. Analyses revealed that the abundance and type of input nucleic acid had the greatest impact on the accuracy of SNV calling as measured by a micro-averaged Matthews correlation coefficient score, with DNA and high RNA inputs (10^7 copies) allowing for variants to be called at a 0.2% frequency. Reduced input RNA (10^5 copies) required more technical replicates to maintain accuracy, while low RNA inputs (10^3 copies) suffered from consensus-level errors. Base errors at specific motifs were also identified in all technical replicates; these can be excluded to further increase SNV calling accuracy. These findings indicate that samples with low RNA inputs should be excluded from SNV calling and reinforce the importance of optimising the technical and bioinformatics steps in pipelines that are used to accurately identify sequence variants.
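As a rough illustration of the accuracy metric named above, the sketch below computes a micro-averaged Matthews correlation coefficient. It assumes that "micro-averaged" means pooling the confusion counts from all samples or replicates before applying the MCC formula; the abstract does not spell out the exact procedure, and the function names and example counts are hypothetical.

```python
import math


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient from one set of confusion counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention: MCC is reported as 0 when any marginal is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


def micro_averaged_mcc(confusions):
    """Pool (tp, fp, tn, fn) tuples across samples, then compute one MCC.

    Assumption: micro-averaging = summing counts before scoring.
    """
    tp = sum(c[0] for c in confusions)
    fp = sum(c[1] for c in confusions)
    tn = sum(c[2] for c in confusions)
    fn = sum(c[3] for c in confusions)
    return mcc(tp, fp, tn, fn)


# Hypothetical counts for two replicates: (tp, fp, tn, fn)
replicates = [(4, 1, 44, 1), (4, 1, 44, 1)]
score = micro_averaged_mcc(replicates)
```

Pooling counts first weights every genome position equally, so replicates with more callable positions contribute more to the final score than they would under a per-replicate (macro) average.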

F1000Research ◽  
2014 ◽  
Vol 2 ◽  
pp. 217 ◽  
Author(s):  
Guillermo Barturen ◽  
Antonio Rueda ◽  
José L. Oliver ◽  
Michael Hackenberg

Whole genome methylation profiling at a single cytosine resolution is now feasible due to the advent of high-throughput sequencing techniques together with bisulfite treatment of the DNA. To obtain the methylation value of each individual cytosine, the bisulfite-treated sequence reads are first aligned to a reference genome, and then the profiling of the methylation levels is done from the alignments. A huge effort has been made to quickly and correctly align the reads, and many different algorithms and programs to do this have been created. However, the second step is just as crucial and non-trivial, yet much less attention has been paid to the final inference of the methylation states. Important error sources do exist, such as sequencing errors, bisulfite failure, clonal reads, and single nucleotide variants. We developed MethylExtract, a user-friendly tool to: i) generate high quality, whole genome methylation maps and ii) detect sequence variation within the same sample preparation. The program is implemented as a single script and takes into account all major error sources. MethylExtract detects variation (SNVs – Single Nucleotide Variants) in a similar way to VarScan, a very sensitive method extensively used in SNV and genotype calling based on non-bisulfite-treated reads. The usefulness of MethylExtract is shown by means of extensive benchmarking based on artificial bisulfite-treated reads and a comparison to a recently published method, called Bis-SNP. MethylExtract is able to detect SNVs within High-Throughput Sequencing experiments of bisulfite-treated DNA at the same time as it generates high quality methylation maps. This simultaneous detection of DNA methylation and sequence variation is crucial for many downstream analyses, for example when deciphering the impact of SNVs on differential methylation. An exclusive feature of MethylExtract, in comparison with existing software, is the possibility to assess the bisulfite failure in a statistical way.
The source code, tutorial and artificial bisulfite datasets are available at http://bioinfo2.ugr.es/MethylExtract/ and http://sourceforge.net/projects/methylextract/, and also permanently accessible from 10.5281/zenodo.7144.
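The second step described above, inferring a methylation value per cytosine from aligned bisulfite reads, can be sketched as follows. This is a minimal illustration of the usual ratio (reads still showing C divided by reads showing C or T at that position), not MethylExtract's actual implementation; the function name, the `min_depth` parameter and its default are assumptions made for the example.

```python
def methylation_level(c_reads, t_reads, min_depth=5):
    """Estimate the methylation level of one cytosine position.

    After bisulfite treatment, unmethylated cytosines are read as T while
    methylated cytosines remain C, so the level is C / (C + T).
    Returns None when coverage is below min_depth (hypothetical cutoff).
    """
    if c_reads < 0 or t_reads < 0:
        raise ValueError("read counts must be non-negative")
    depth = c_reads + t_reads
    if depth < min_depth:
        return None
    return c_reads / depth


# Hypothetical position covered by 20 reads, 15 of which still show C
level = methylation_level(15, 5)
```

A depth cutoff of this kind is one simple guard against the error sources listed in the abstract; sequencing errors, clonal reads and SNVs at the cytosine position would each need their own handling.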


2019 ◽  
Vol 2 (1) ◽  
pp. 15 ◽  
Author(s):  
Martin Broberg ◽  
James E. McDonald

The application of high-throughput nucleic acid and protein sequencing technologies is transforming our understanding of plant microbiomes and their interactions with their hosts in health and disease. However, progress in studying host-microbiome interactions in above-ground compartments of the tree (the phyllosphere) has been hampered by high concentrations of phenolic compounds, lignin, and other compounds in tree bark that severely limit the success of DNA, RNA, and protein extraction. Here we present modified sample-preparation and kit-based protocols for the extraction of host and microbiome DNA and RNA from oak (Quercus robur and Quercus petraea) bark tissue for subsequent high-throughput sequencing. In addition, reducing the quantity of bark tissue used for an established protein extraction protocol yielded high quality protein for parallel analysis of the oak-microbiota metaproteome. These procedures demonstrate the successful extraction of nucleic acids and proteins from oak tissue using as little as 50 mg of sample input, producing sufficient quantities for nucleic acid sequencing and protein mass spectrometry of tree stem tissues and their associated microbiota.


F1000Research ◽  
2013 ◽  
Vol 2 ◽  
pp. 217 ◽  
Author(s):  
Guillermo Barturen ◽  
Antonio Rueda ◽  
José L. Oliver ◽  
Michael Hackenberg



2016 ◽  
Vol 7 ◽  
Author(s):  
Feng Chen ◽  
Zibo Zhu ◽  
Xiaobian Zhou ◽  
Yan Yan ◽  
Zhongdong Dong ◽  
...  

Cell Reports ◽  
2018 ◽  
Vol 25 (9) ◽  
pp. 2369-2378.e4 ◽  
Author(s):  
Kyle Wolf ◽  
Tyler Hether ◽  
Pavlo Gilchuk ◽  
Amrendra Kumar ◽  
Ahmad Rajeh ◽  
...  

2021 ◽  
Author(s):  
Inga Usher ◽  
Lorena Ligammari ◽  
Sara Ahrabi ◽  
Emily Hepburn ◽  
Calum Connolly ◽  
...  

Single nucleotide variants are the commonest genetic alterations in the human genome. At least 60,000 have been reported to be associated with disease. The CRISPR/Cas9 system has transformed genetic research, making it possible to edit single nucleotides and study the function of genetic variants in vitro. While significant advances have improved the efficiency of CRISPR/Cas9, the editing of single nucleotides remains challenging. There are two major obstacles: the low efficiency of accurate editing and the difficulty of isolating accurately edited cells from a pool of cells with other editing outcomes. We present data from 85 transfections of induced pluripotent stem cells and an immortalised cell line, comparing the effects of altering CRISPR/Cas9 design and experimental conditions on rates of single nucleotide substitution. We targeted variants in TP53, which predispose to several cancers, and in TBXT, which is implicated in the pathogenesis of the bone cancer chordoma. We describe a scalable and adaptable workflow for single nucleotide editing that incorporates contemporary techniques including Illumina MiSeq sequencing, TaqMan qPCR and digital droplet PCR for screening transfected cells, as well as quality control steps to mitigate common pitfalls. This workflow can be applied to CRISPR/Cas9 and other genome editing systems to maximise experimental efficiency.


2021 ◽  
Author(s):  
Zhiyong Chen ◽  
Yancen He ◽  
Yasir Iqbal ◽  
Yanlan Shi ◽  
Hongmei Huang ◽  
...  

Abstract Background: Miscanthus, which is a leading dedicated-energy grass in Europe and in parts of Asia, is expected to play a key role in the development of the future bioeconomy. However, due to its complex genetic background, it is difficult to investigate phylogenetic relationships and the evolution of gene function in this genus. Here, we investigated 50 Miscanthus germplasms: 1 female parent (M. lutarioriparius), 30 candidate male parents (M. lutarioriparius, M. sinensis, and M. sacchariflorus), and 19 offspring. We used high-throughput Specific-Locus Amplified Fragment sequencing (SLAF-seq) to identify informative single nucleotide polymorphisms (SNPs) in all germplasms. Results: We identified 800,081 SLAF tags, of which 160,368 were polymorphic. Each tag was 264–364 bp long. The obtained SNPs were used to investigate genetic relationships within Miscanthus. We constructed a phylogenetic tree of the 50 germplasms using the obtained SNPs, and found that the germplasms fell into two clades: one clade of M. sinensis only and one clade that included the offspring, M. lutarioriparius, and M. sacchariflorus. Genetic cluster analysis indicated that M. lutarioriparius germplasm C3 was the most likely male parent of the offspring. Conclusions: As a high-throughput sequencing method, SLAF-seq can be used to identify informative SNPs in Miscanthus germplasms and to rapidly characterize genetic relationships within this genus. Our results will support the development of breeding programs utilizing Miscanthus cultivars with elite biomass- or fiber-production potential.

