High-Throughput Sequencing Reveals Single Nucleotide Variants in Longer-Kernel Bread Wheat

The discrepancy among single nucleotide variants detected by DNA and RNA high throughput sequencing data

BMC Genomics ◽

10.1186/s12864-017-4022-x ◽

2017 ◽

Vol 18 (S6) ◽

Cited By ~ 16

Author(s):

Yan Guo ◽

Shilin Zhao ◽

Quanhu Sheng ◽

David C Samuels ◽

Yu Shyr

Keyword(s):

High Throughput ◽

High Throughput Sequencing ◽

Sequencing Data ◽

Single Nucleotide Variants ◽

Single Nucleotide ◽

Dna And Rna ◽

High Throughput Sequencing Data

MethylExtract: High-Quality methylation maps and SNV calling from whole genome bisulfite sequencing data

F1000Research ◽

10.12688/f1000research.2-217.v2 ◽

2014 ◽

Vol 2 ◽

pp. 217 ◽

Cited By ~ 8

Author(s):

Guillermo Barturen ◽

Antonio Rueda ◽

José L. Oliver ◽

Michael Hackenberg

Keyword(s):

High Throughput ◽

Sequence Variation ◽

High Throughput Sequencing ◽

Whole Genome ◽

Single Nucleotide Variants ◽

High Quality ◽

Single Nucleotide ◽

Error Sources ◽

Link Type ◽

Genome Methylation

Whole genome methylation profiling at single-cytosine resolution is now feasible thanks to high-throughput sequencing techniques combined with bisulfite treatment of the DNA. To obtain the methylation value of each individual cytosine, the bisulfite-treated sequence reads are first aligned to a reference genome, and the methylation levels are then profiled from the alignments. A huge effort has been made to align the reads quickly and correctly, and many different algorithms and programs have been created to do this. However, the second step is just as crucial and non-trivial, yet much less attention has been paid to the final inference of the methylation states. Important error sources exist, such as sequencing errors, bisulfite failure, clonal reads, and single nucleotide variants. We developed MethylExtract, a user-friendly tool to: i) generate high-quality, whole genome methylation maps and ii) detect sequence variation within the same sample preparation. The program is implemented as a single script and takes into account all major error sources. MethylExtract detects variation (single nucleotide variants, SNVs) in a similar way to VarScan, a very sensitive method extensively used in SNV and genotype calling based on non-bisulfite-treated reads. The usefulness of MethylExtract is shown by means of extensive benchmarking based on artificial bisulfite-treated reads and a comparison to a recently published method, Bis-SNP. MethylExtract is able to detect SNVs within high-throughput sequencing experiments of bisulfite-treated DNA at the same time as it generates high-quality methylation maps. This simultaneous detection of DNA methylation and sequence variation is crucial for many downstream analyses, for example when deciphering the impact of SNVs on differential methylation. An exclusive feature of MethylExtract, in comparison with existing software, is the possibility to assess bisulfite failure statistically.
The source code, tutorial and artificial bisulfite datasets are available at http://bioinfo2.ugr.es/MethylExtract/ and http://sourceforge.net/projects/methylextract/, and also permanently accessible from 10.5281/zenodo.7144.
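
The second step described in the abstract, inferring a methylation value per cytosine from aligned bisulfite reads, can be illustrated with a minimal Python sketch. It is not MethylExtract's implementation: it assumes the common convention that reads showing C at a reference cytosine support methylation and reads showing T support an unmethylated (converted) base, and the function name and the `min_coverage` parameter are hypothetical. It does not model the error sources listed above (sequencing errors, bisulfite failure, clonal reads, SNVs).

```python
from typing import Optional


def methylation_level(c_reads: int, t_reads: int, min_coverage: int = 1) -> Optional[float]:
    """Fraction of reads supporting methylation at one cytosine position.

    c_reads: reads showing C (unconverted, taken as methylated).
    t_reads: reads showing T (converted, taken as unmethylated).
    min_coverage: hypothetical threshold; positions with fewer reads are skipped.
    """
    if c_reads < 0 or t_reads < 0:
        raise ValueError("read counts must be non-negative")
    coverage = c_reads + t_reads
    # Skip positions with no reads or too few reads for a stable estimate.
    if coverage == 0 or coverage < min_coverage:
        return None
    return c_reads / coverage
```

A position covered by 8 C reads and 2 T reads would be reported as 0.8 under these assumptions; a position below the coverage threshold returns `None` rather than an unreliable ratio.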

MethylExtract: High-Quality methylation maps and SNV calling from whole genome bisulfite sequencing data

F1000Research ◽

10.12688/f1000research.2-217.v1 ◽

2013 ◽

Vol 2 ◽

pp. 217 ◽

Cited By ~ 18

Author(s):

Guillermo Barturen ◽

Antonio Rueda ◽

José L. Oliver ◽

Michael Hackenberg

Keyword(s):

High Throughput ◽

Sequence Variation ◽

High Throughput Sequencing ◽

Whole Genome ◽

Single Nucleotide Variants ◽

High Quality ◽

Single Nucleotide ◽

Error Sources ◽

Link Type ◽

Genome Methylation

Whole genome methylation profiling at single-cytosine resolution is now feasible thanks to high-throughput sequencing techniques combined with bisulfite treatment of the DNA. To obtain the methylation value of each individual cytosine, the bisulfite-treated sequence reads are first aligned to a reference genome, and the methylation levels are then profiled from the alignments. A huge effort has been made to align the reads quickly and correctly, and many different algorithms and programs have been created to do this. However, the second step is just as crucial and non-trivial, yet much less attention has been paid to the final inference of the methylation states. Important error sources exist, such as sequencing errors, bisulfite failure, clonal reads, and single nucleotide variants. We developed MethylExtract, a user-friendly tool to: i) generate high-quality, whole genome methylation maps and ii) detect sequence variation within the same sample preparation. The program is implemented as a single script and takes into account all major error sources. MethylExtract detects variation (single nucleotide variants, SNVs) in a similar way to VarScan, a very sensitive method extensively used in SNV and genotype calling based on non-bisulfite-treated reads. The usefulness of MethylExtract is shown by means of extensive benchmarking based on artificial bisulfite-treated reads and a comparison to a recently published method, Bis-SNP. MethylExtract is able to detect SNVs within high-throughput sequencing experiments of bisulfite-treated DNA at the same time as it generates high-quality methylation maps. This simultaneous detection of DNA methylation and sequence variation is crucial for many downstream analyses, for example when deciphering the impact of SNVs on differential methylation. An exclusive feature of MethylExtract, in comparison with existing software, is the possibility to assess bisulfite failure statistically.
The source code, tutorial and artificial bisulfite datasets are available at http://bioinfo2.ugr.es/MethylExtract/ and http://sourceforge.net/projects/methylextract/, and also permanently accessible from 10.5281/zenodo.7144.

A Systematic Evaluation of High-Throughput Sequencing Approaches to Identify Low-Frequency Single Nucleotide Variants in Viral Populations

Viruses ◽

10.3390/v12101187 ◽

2020 ◽

Vol 12 (10) ◽

pp. 1187

Author(s):

David J. King ◽

Graham Freimanis ◽

Lidia Lasecka-Dykes ◽

Amin Asfor ◽

Paolo Ribeca ◽

...

Keyword(s):

Nucleic Acid ◽

High Throughput ◽

High Throughput Sequencing ◽

Low Frequency ◽

Illumina Miseq ◽

Read Length ◽

Systematic Evaluation ◽

Single Nucleotide Variants ◽

Single Nucleotide ◽

Dna And Rna

High-throughput sequencing technologies, such as those provided by Illumina, are an efficient way to understand sequence variation within viral populations. However, challenges exist in distinguishing process-introduced error from biological variance, which significantly impacts our ability to identify sub-consensus single-nucleotide variants (SNVs). Here we have taken a systematic approach to evaluate laboratory and bioinformatic pipelines to accurately identify low-frequency SNVs in viral populations. Artificial DNA and RNA “populations” were created by introducing known SNVs at predetermined frequencies into template nucleic acid before being sequenced on an Illumina MiSeq platform. These were used to assess the effects of abundance and starting input material type, technical replicates, read length and quality, short-read aligner, and percentage frequency thresholds on the ability to accurately call variants. Analyses revealed that the abundance and type of input nucleic acid had the greatest impact on the accuracy of SNV calling as measured by a micro-averaged Matthews correlation coefficient score, with DNA and high RNA inputs (10^7 copies) allowing for variants to be called at a 0.2% frequency. Reduced input RNA (10^5 copies) required more technical replicates to maintain accuracy, while low RNA inputs (10^3 copies) suffered from consensus-level errors. Base errors at specific motifs that recurred in all technical replicates were also identified; these can be excluded to further increase SNV calling accuracy. These findings indicate that samples with low RNA inputs should be excluded from SNV calling and reinforce the importance of optimising the technical and bioinformatics steps in pipelines that are used to accurately identify sequence variants.
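
The accuracy measure named in the abstract, a micro-averaged Matthews correlation coefficient (MCC), can be sketched in Python as follows. This assumes the usual meaning of micro-averaging, namely pooling true/false positive and negative counts across samples or replicates before computing a single MCC; the abstract does not spell out the authors' exact procedure, and the function name and the `(TP, FP, TN, FN)` tuple layout are hypothetical.

```python
import math
from typing import Iterable, Tuple

# Hypothetical layout for one sample's confusion counts: (TP, FP, TN, FN).
Counts = Tuple[int, int, int, int]


def micro_averaged_mcc(per_sample: Iterable[Counts]) -> float:
    """Pool confusion counts over all samples, then compute one MCC."""
    tp = fp = tn = fn = 0
    for s_tp, s_fp, s_tn, s_fn in per_sample:
        tp += s_tp
        fp += s_fp
        tn += s_tn
        fn += s_fn
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: report 0.0 when any marginal total is zero.
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom
```

Pooling before scoring means replicates with many candidate positions weigh more than small ones, which differs from averaging per-replicate MCC values (macro-averaging).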

Investigation of Genetic Relationships within Miscanthus using SNP Markers Identified using SLAF-Seq

10.21203/rs.3.rs-152687/v1 ◽

2021 ◽

Author(s):

ZHIYONG Chen ◽

Yancen He ◽

Yasir Iqbal ◽

Yanlan Shi ◽

Hongmei Huang ◽

...

Keyword(s):

High Throughput ◽

High Throughput Sequencing ◽

Genetic Relationships ◽

Female Parent ◽

Snp Markers ◽

Nucleotide Polymorphisms ◽

Single Nucleotide ◽

Breeding Programs ◽

Sequencing Method ◽

Specific Locus

Background: Miscanthus, which is a leading dedicated-energy grass in Europe and in parts of Asia, is expected to play a key role in the development of the future bioeconomy. However, due to its complex genetic background, it is difficult to investigate phylogenetic relationships and the evolution of gene function in this genus. Here, we investigated 50 Miscanthus germplasms: 1 female parent (M. lutarioriparius), 30 candidate male parents (M. lutarioriparius, M. sinensis, and M. sacchariflorus), and 19 offspring. We used high-throughput Specific-Locus Amplified Fragment sequencing (SLAF-seq) to identify informative single nucleotide polymorphisms (SNPs) in all germplasms. Results: We identified 800,081 SLAF tags, of which 160,368 were polymorphic. Each tag was 264–364 bp long. The obtained SNPs were used to investigate genetic relationships within Miscanthus. We constructed a phylogenetic tree of the 50 germplasms using the obtained SNPs, and found that the germplasms fell into two clades: one clade of M. sinensis only and one clade that included the offspring, M. lutarioriparius, and M. sacchariflorus. Genetic cluster analysis indicated that M. lutarioriparius germplasm C3 was the most likely male parent of the offspring. Conclusions: As a high-throughput sequencing method, SLAF-seq can be used to identify informative SNPs in Miscanthus germplasms and to rapidly characterize genetic relationships within this genus. Our results will support the development of breeding programs utilizing Miscanthus cultivars with elite biomass- or fiber-production potential.

The Phenolyzer Suite: Prioritizing the Candidate Genes Involved in Microtia

Annals of Otology Rhinology & Laryngology ◽

10.1177/0003489419840052 ◽

2019 ◽

Vol 128 (6) ◽

pp. 556-562 ◽

Cited By ~ 1

Author(s):

Huang Xin ◽

Wang Changchen ◽

Liu Lei ◽

Yang Meirong ◽

Zhang Ye ◽

...

Keyword(s):

Candidate Genes ◽

High Throughput ◽

Potential Candidate ◽

Single Nucleotide Variants ◽

Gene Score ◽

Single Nucleotide ◽

Research Directions ◽

Score System ◽

Pathogenic Genes ◽

First Time

Objective: Microtia is a congenital malformation of the external ear. Great progress has been made in recent years on the genetics of microtia. This article aimed to prioritize the potential candidate pathogenic genes of microtia based on existing studies and reports, in order to narrow the scope of subsequent studies quickly and on a scientific basis. Method: A computational tool called Phenolyzer (phenotype-based gene analyzer) was used to prioritize microtia genes. Microtia was entered as a query term in the Phenolyzer interface. After several steps, including disease match, gene query, gene score system, seed gene growth, and gene ranking, the final results on the genetic information of microtia were provided. We then tracked details of the top 10 genes ranked by Phenolyzer on the basis of previous reports. Results: We detected 10 348 genes associated with microtia or related syndromes, of which 78 were seed genes. Every gene was given a score, and genes with higher scores were more likely to influence microtia. The top 10 ranked genes were HOXA2, CHD7, CDT1, ORC1, ORC4, ORC6, CDC6, MED12, TWIST1, and GLI3. In addition, four gene-gene interactions were displayed. Conclusion: This article prioritized candidate genes of microtia for the first time. High-throughput methods provide tens of thousands of single-nucleotide variants, indels, and structural variants, of which only a handful are relevant to microtia or associated syndromes. Combining the ranked list of potential pathogenic genes from Phenolyzer with the results from samples analyzed by high-throughput methods yields more precise research directions.

11 Million SNP Reference Profiles for Identity Searching Across Ethnicities, Kinship, and Admixture

10.1101/321190 ◽

2018 ◽

Author(s):

Brian S. Helfer ◽

Darrell O. Ricke

Keyword(s):

Single Nucleotide Polymorphisms ◽

High Throughput ◽

Family Relationships ◽

In Silico ◽

High Throughput Sequencing ◽

Genetic Data ◽

Nucleotide Polymorphisms ◽

Mixture Analysis ◽

Single Nucleotide ◽

Public Repositories

High throughput sequencing (HTS) of single nucleotide polymorphisms (SNPs) provides additional applications for DNA forensics including identification, mixture analysis, kinship prediction, and biogeographic ancestry prediction. Public repositories of human genetic data are being rapidly generated and released, but the majority of these samples are de-identified to protect privacy, and have little or no individual metadata such as appearance (photos), ethnicity, relatives, etc. A reference in silico dataset has been generated to enable development and testing of new DNA forensics algorithms. This dataset provides 11 million SNP profiles for individuals with defined ethnicities and family relationships spanning eight generations with admixture for a panel with 39,108 SNPs.

Episo: quantitative estimation of RNA 5-methylcytosine at isoform level by high-throughput sequencing of RNA treated with bisulfite

Bioinformatics ◽

10.1093/bioinformatics/btz900 ◽

2019 ◽

Vol 36 (7) ◽

pp. 2033-2039 ◽

Cited By ~ 2

Author(s):

Junfeng Liu ◽

Ziyang An ◽

Jianjun Luo ◽

Jing Li ◽

Feifei Li ◽

...

Keyword(s):

High Throughput ◽

High Throughput Sequencing ◽

Quantitative Estimation ◽

Supplementary Information ◽

Biological Processes ◽

Single Nucleotide ◽

Rna Immunoprecipitation ◽

Nucleotide Resolution ◽

Human And Mouse ◽

Single Nucleotide Resolution

Motivation: RNA 5-methylcytosine (m5C) is a type of post-transcriptional modification that may be involved in numerous biological processes and tumorigenesis. RNA m5C can be profiled at single-nucleotide resolution by high-throughput sequencing of RNA treated with bisulfite (RNA-BisSeq). However, the transcriptome-wide profile and potential function of m5C in splicing remain to be elucidated due to the lack of an isoform-level m5C quantification tool. Results: We developed a computational package to quantify Epitranscriptomal RNA m5C at the transcript isoform level (named Episo). Episo consists of three tools: mapper, quant and Bisulfitefq, for mapping, quantifying and simulating RNA-BisSeq data, respectively. The high accuracy of Episo was validated using an improved m5C-specific methylated RNA immunoprecipitation (meRIP) protocol, as well as a set of in silico experiments. By applying Episo to public human and mouse RNA-BisSeq data, we found that RNA m5C is not evenly distributed among the transcript isoforms, implying that m5C may be subject to regulation at the isoform level. Availability and implementation: Episo is released under the GNU GPLv3+ license. The source code of Episo is freely accessible from https://github.com/liujunfengtop/Episo (with Tophat/cufflink) and https://github.com/liujunfengtop/Episo/tree/master/Episo_Kallisto (with Kallisto). Supplementary information: Supplementary data are available at Bioinformatics online.

Computational detection, analysis and interpretations of genomic variants in human diseases associated gene MDM2

Biomedical Letters ◽

10.47262/bl/7.2.20210705 ◽

2021 ◽

Vol 7 (2) ◽

pp. 141-154

Keyword(s):

Computational Methods ◽

High Throughput Sequencing ◽

Cancer Susceptibility ◽

In Silico Analysis ◽

Human Diseases ◽

Nucleotide Polymorphisms ◽

Single Nucleotide Variants ◽

Mdm2 Gene ◽

Single Nucleotide ◽

Computational Tools

Most of the mutations described in human MDM2 are tolerated without significantly disrupting the corresponding structural or molecular function. However, some of them are associated with a variety of human diseases, including cancer. Numerous computational methods have been developed to predict the effects of missense single nucleotide variants (SNVs). The non-synonymous single nucleotide polymorphisms affect the function of XRCC1, which impairs the ability to repair DNA and therefore increases the risk of diseases such as cancer. In this study, sequence- and structure-based computational tools were used to screen all listed coding SNPs of the MDM2 gene in order to identify and characterize them. Six potential nsSNPs of MDM2 were identified from 29 nsSNPs by consistent analysis using the computational tools PolyPhen 2, SIFT, PANTHER and cSNP. The computational methods were used to systematically classify functional mutations in the regulatory and coding regions that modify the expression and function of the MDM2 enzyme. The HOPE project also made it possible to elaborate the structural effects of the amino acid substitutions. In silico analysis predicted that rs759244097 is harmful. This study concluded that identifying this SNP will help to determine an individual's cancer susceptibility, prognosis and further treatment. Furthermore, current high-throughput sequencing efforts and the need for extensive interpretation of protein sequence variants require more efficient and accurate computational methods in the coming years.
