2278

Alistair N. Ward; Matt Velinder; Chase Miller; Tony Di Sera; Yi Qiao; Dave Viskochil; Gabor Marth

doi:10.1017/cts.2017.64

2278

Journal of Clinical and Translational Science ◽

10.1017/cts.2017.64 ◽

2017 ◽

Vol 1 (S1) ◽

pp. 13-14

Author(s):

Alistair N. Ward ◽

Matt Velinder ◽

Chase Miller ◽

Tony Di Sera ◽

Yi Qiao ◽

...

Keyword(s):

Real Time ◽

Protein Function ◽

Variant Calling ◽

Compound Heterozygote ◽

Treacher Collins Syndrome ◽

Whole Genome Sequencing Data ◽

Missense Mutations ◽

Sequencing Data ◽

Web Based ◽

Functional Studies

OBJECTIVES/SPECIFIC AIMS: The objective of the study was 2-fold: to identify potentially deleterious alleles in a child with Treacher Collins syndrome, and to demonstrate the value of the iobio analysis platform for intuitively and rapidly analyzing genomic data. METHODS/STUDY POPULATION: We used the iobio suite of web-based applications to analyze quality metrics for the sequencing data and called variants for the proband and his parents. We then visually interrogated variants in genes potentially associated with the syndrome in real time, using the intuitive gene.iobio application. We sought high-impact variants that had a predicted impact on protein function and were at low allele frequency in the general human population. Variants were also compared against the ClinVar database of known mutations to identify variants that have already been associated with this or related syndromes in the literature or clinical studies. Finally, the gene.iobio tool allows users to interrogate the primary sequencing data to ensure that no variants were missed by the primary variant calling pipeline. This analysis was performed using intuitive web-based apps in real time, and consequently represents a system that is available to users who are traditionally excluded from these analyses. RESULTS/ANTICIPATED RESULTS: The iobio suite was used to rapidly assess data quality and interrogate genetic variants for a child with Treacher Collins syndrome. A compound heterozygote consisting of 2 missense alleles in the TCOF1 gene was identified as a compelling candidate for pathogenicity, necessitating further functional investigation. The study helped validate the use of the intuitive iobio tools in such analyses, strengthening the case for greater involvement of medical professionals in data analysis.
DISCUSSION/SIGNIFICANCE OF IMPACT: The analyses demonstrated that the whole genome sequencing data for the family being studied were of very high quality, although 1 gene showed a local region of almost zero coverage. This ensured that study conclusions can be presented with confidence. A variant associated with Treacher Collins syndrome 1 in ClinVar was uncovered in the TCOF1 gene; however, given its benign rating, this variant was not considered further. The most interesting candidate was a compound heterozygote, consisting of 2 missense mutations, also in the TCOF1 gene. These mutations occurred with allele frequencies of 22% and 8% in the general population, and additional molecular and functional studies are currently being pursued.
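The trio-based search for a compound heterozygote described above can be sketched in a few lines. This is a minimal illustration only, not the gene.iobio implementation: the record layout, the genotype encoding (count of alternate alleles), the function name `compound_het_candidates`, and the `max_af` threshold are all assumptions made for the example.

```python
from collections import defaultdict

def compound_het_candidates(variants, max_af=0.01):
    """Return {gene: [variant ids]} for genes in which the proband carries
    at least one rare heterozygous variant inherited from each parent.

    Genotypes are encoded as the number of alternate alleles (0, 1, or 2);
    this encoding and the max_af cutoff are assumptions for illustration.
    """
    by_gene = defaultdict(lambda: {"maternal": [], "paternal": []})
    for v in variants:
        # Keep only rare variants that are heterozygous in the proband.
        if v["af"] > max_af or v["proband"] != 1:
            continue
        # Assign a parent of origin only when one parent is heterozygous
        # and the other is homozygous reference.
        if v["mother"] == 1 and v["father"] == 0:
            by_gene[v["gene"]]["maternal"].append(v["id"])
        elif v["father"] == 1 and v["mother"] == 0:
            by_gene[v["gene"]]["paternal"].append(v["id"])
    return {
        gene: sorted(origin["maternal"] + origin["paternal"])
        for gene, origin in by_gene.items()
        if origin["maternal"] and origin["paternal"]
    }

# Hypothetical records; none of these values come from the study.
example = [
    {"id": "v1", "gene": "GENE_A", "af": 0.002, "proband": 1, "mother": 1, "father": 0},
    {"id": "v2", "gene": "GENE_A", "af": 0.004, "proband": 1, "mother": 0, "father": 1},
    {"id": "v3", "gene": "GENE_B", "af": 0.300, "proband": 1, "mother": 1, "father": 0},
]
print(compound_het_candidates(example))
```

Requiring one variant from each parent is what distinguishes a compound heterozygote from two variants on the same parental chromosome. The frequency cutoff is a tunable choice and would need to be set per study.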

Are Mutations in the DHRS9 Gene Causally Linked to Epilepsy? A Case Report

Medicina ◽

10.3390/medicina56080387 ◽

2020 ◽

Vol 56 (8) ◽

pp. 387

Author(s):

Francesco Calì ◽

Maurizio Elia ◽

Mirella Vinci ◽

Luigi Vetri ◽

Edvige Correnti ◽

...

Keyword(s):

Case Report ◽

Protein Function ◽

Early Onset ◽

Neuronal Excitability ◽

Compound Heterozygote ◽

Aminobutyric Acid ◽

Missense Mutations ◽

Gamma Aminobutyric Acid ◽

Whole Exome ◽

Positive Modulator

The DHRS9 gene is involved in several pathways, including the synthesis of allopregnanolone from progesterone. Allopregnanolone is a positive modulator of gamma-aminobutyric acid (GABA) action and plays a role in the control of neuronal excitability and seizures. Whole-exome sequencing performed on a girl with early-onset epilepsy revealed that she was a compound heterozygote for two novel missense mutations of the DHRS9 gene likely to disrupt protein function. No previous studies have reported the implication of this gene in epilepsy. We discuss a new potential pathogenic mechanism underlying epilepsy in a child, due to a defective progesterone pathway.

Seven novel glucose-6-phosphate dehydrogenase (G6PD) deficiency variants identified in the Qatari population

Human Genomics ◽

10.1186/s40246-021-00358-9 ◽

2021 ◽

Vol 15 (1) ◽

Author(s):

Shaza Malik ◽

Roan Zaied ◽

Najeeb Syed ◽

Puthen Jithesh ◽

Mashael Al-Shafai

Keyword(s):

Protein Function ◽

World Health ◽

Whole Genome Sequencing Data ◽

The Novel ◽

Class Iii ◽

Sequencing Data ◽

Novel Variants ◽

The World ◽

Glucose 6 Phosphate Dehydrogenase ◽

The Impact

Abstract. Background: Glucose-6-phosphate dehydrogenase deficiency (G6PDD) is the most common red cell enzymopathy in the world. In Qatar, the incidence of G6PDD is estimated at around 5%; however, no study has yet investigated the genetic basis of G6PDD in the Qatari population. Methods: In this study, we analyzed whole-genome sequencing data generated by the Qatar Genome Programme for 6045 Qatar Biobank participants, to identify G6PDD variants in the Qatari population. In addition, we assessed the impact of the novel variants identified on protein function both in silico and by measuring G6PD enzymatic activity in the subjects carrying them. Results: We identified 375 variants in or near the G6PD gene, of which 20 were high-impact and 16 were moderate-impact variants. Of these, 14 were known G6PDD-causing variants. The most frequent G6PDD-causing variants found in the Qatari population were p.Ser188Phe (G6PD Mediterranean), p.Asn126Asp (G6PD A +), p.Val68Met (G6PD Asahi), p.Ala335Thr (G6PD Chatham), and p.Ile48Thr (G6PD Aures) with allele frequencies of 0.0563, 0.0194, 0.00785, 0.0050, and 0.00380, respectively. Furthermore, we have identified seven novel G6PD variants, all of which were confirmed as G6PDD-causing variants and classified as class III variants based on the World Health Organization’s classification scheme. Conclusions: This is the first study investigating the molecular basis of G6PDD in Qatar. It provides novel insights into G6PDD pathogenesis and highlights the importance of studying such understudied populations.
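Because G6PD lies on the X chromosome, an allele frequency for it is counted over X chromosomes rather than over individuals. The sketch below shows that counting rule under stated assumptions. The abstract does not say how its frequencies were computed, so the function name and all counts here are hypothetical.

```python
def x_linked_allele_frequency(female_het, female_hom, male_hemi,
                              n_females, n_males):
    """Alternate-allele frequency for an X-linked variant.

    Females carry two X chromosomes and males one, so the denominator is
    2 * n_females + n_males rather than 2 * (n_females + n_males).
    """
    total_chromosomes = 2 * n_females + n_males
    if total_chromosomes == 0:
        raise ValueError("no chromosomes sampled")
    alt_alleles = female_het + 2 * female_hom + male_hemi
    return alt_alleles / total_chromosomes

# Hypothetical counts for illustration; not taken from the study.
freq = x_linked_allele_frequency(female_het=30, female_hom=2, male_hemi=16,
                                 n_females=300, n_males=400)
print(round(freq, 4))
```

Using the autosomal denominator instead would understate the frequency whenever the cohort includes males.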

Shiny-SoSV: A web-based performance calculator for somatic structural variant detection

10.1101/668723 ◽

2019 ◽

Author(s):

Tingting Gong ◽

Vanessa M Hayes ◽

Eva KF Chan

Keyword(s):

Study Design ◽

Additive Model ◽

Whole Genome Sequencing Data ◽

Sequencing Data ◽

Web Based ◽

Generalised Additive Model ◽

Structural Variant ◽

Variant Detection ◽

User Friendly ◽

The Impact

Abstract. Somatic structural variants are an important contributor to cancer development and evolution. Accurate detection of these complex variants from whole genome sequencing data is influenced by a multitude of parameters. However, there are currently no tools for guiding study design, nor are there applications that could predict the performance of somatic structural variant detection. To address this gap, we developed Shiny-SoSV, a user-friendly web-based calculator for determining the impact of common variables on the sensitivity and precision of somatic structural variant detection, including choice of variant detection tool, sequencing depth of coverage, variant allele fraction, and variant breakpoint resolution. Using simulation studies, we determined singular and combinatoric effects of these variables and modelled the results using a generalised additive model, allowing structural variant detection performance to be predicted for any combination of predictors. Shiny-SoSV provides an interactive and visual platform for users to easily compare the individual and combined impact of different parameters. It predicts the performance of a proposed study design on somatic structural variant detection, prior to the commencement of benchwork. Shiny-SoSV is freely available at https://hcpcg.shinyapps.io/Shiny-SoSV with accompanying user’s guide and example use-cases.

SMuRF: Portable and accurate ensemble-based somatic variant calling

10.1101/270413 ◽

2018 ◽

Cited By ~ 2

Author(s):

Weitai Huang ◽

Yu Amanda Guo ◽

Karthik Muthukumar ◽

Probhonjon Baruah ◽

Meimei Chang ◽

...

Keyword(s):

Point Mutations ◽

Variant Calling ◽

Whole Genome Sequencing Data ◽

Sequencing Data ◽

Somatic Variant ◽

Level Data ◽

Machine Learning Approach ◽

Cancer Types ◽

User Friendly ◽

Improved Accuracy

Abstract. Summary: SMuRF is an ensemble method for prediction of somatic point mutations (SNVs) and small insertions/deletions (indels) in cancer genomes. The method integrates predictions and auxiliary features from different somatic mutation callers using a Random Forest machine learning approach. SMuRF is trained on community-curated tumor whole genome sequencing data, is robust across cancer types, and achieves improved accuracy for both SNV and indel predictions of genome and exome-level data. The software is user-friendly and portable by design, operating as an add-on to the community-developed bcbio-nextgen somatic variant calling pipeline.

GRIDSS, PURPLE, LINX: Unscrambling the tumor genome via integrated analysis of structural variation and copy number

10.1101/781013 ◽

2019 ◽

Cited By ~ 8

Author(s):

Daniel L. Cameron ◽

Jonathan Baber ◽

Charles Shale ◽

Anthony T. Papenfuss ◽

Jose Espejo Valle-Inclan ◽

...

Keyword(s):

Copy Number ◽

Variant Calling ◽

Genomic Rearrangements ◽

Whole Genome Sequencing Data ◽

Integrated Analysis ◽

Derivative Chromosome ◽

Structural Variants ◽

Sequencing Data ◽

Structural Variant ◽

Complex Events

Abstract. We have developed a novel, integrated and comprehensive purity, ploidy, structural variant and copy number somatic analysis toolkit for whole genome sequencing data of paired tumor/normal samples. We show that the combination of using GRIDSS for somatic structural variant calling and PURPLE for somatic copy number alteration calling allows highly sensitive, precise and consistent copy number and structural variant determination, as well as providing novel insights for short structural variants and regions of complex local topology. LINX, an interpretation tool, leverages the integrated structural variant and copy number calling to cluster individual structural variants into higher order events and chains them together to predict local derivative chromosome structure. LINX classifies and extensively annotates genomic rearrangements including simple and reciprocal breaks, LINE, viral and pseudogene insertions, and complex events such as chromothripsis. LINX also comprehensively calls genic fusions including chained fusions. Finally, our toolkit provides novel visualisation methods providing insight into complex genomic rearrangements.

TBIO-12. THE SPECTRUM OF MITOCHONDRIAL DNA (mtDNA) MUTATIONS IN PEDIATRIC CENTRAL NERVOUS SYSTEM (CNS) TUMORS

Neuro-Oncology ◽

10.1093/neuonc/noaa222.839 ◽

2020 ◽

Vol 22 (Supplement_3) ◽

pp. iii468-iii469

Author(s):

Kristiyana Kaneva ◽

Petr Triska ◽

Daria Merkurjev ◽

Moiz Bootwalla ◽

Jennifer Cotter ◽

...

Keyword(s):

Mitochondrial Dna ◽

Cns Tumors ◽

Whole Genome Sequencing Data ◽

Low Grade ◽

Missense Mutations ◽

Loss Of Function ◽

Sequencing Data ◽

Mtdna Mutation ◽

Tumor Subtypes ◽

Mtdna Mutations

Abstract. To explore the role of mitochondrial DNA mutations in pediatric CNS tumors, we analyzed 749 tumor-normal paired whole genome sequencing data sets from the Children’s Brain Tumor Tissue Consortium (CBTTC). We detected 307 somatic mtDNA mutations in 222 CNS tumors (29.6%). Most frequently observed were missense mutations (38.1%). We also detected 34 loss-of-function mutations. Different pediatric CNS tumor subtypes have distinct mtDNA mutation profiles. For categorical comparisons, we analyzed subtypes with at least 15 samples. The highest number of mtDNA mutations per tumor sample was in meningiomas (0.85), while atypical teratoid rhabdoid tumors (ATRTs) had the lowest number per sample (0.18). High-grade gliomas had a higher number of mtDNA mutations per sample than low-grade gliomas (0.56 vs. 0.31) (p = 0.0011), with almost twice as many missense mtDNA mutations per sample (0.22 vs. 0.13) (p < 0.001), and higher average heteroplasmy levels (11% vs. 9%). The average heteroplasmy was 10.1%, ranging from 15.6% in medulloblastoma to 6.36% in schwannoma, suggesting that these are clonal alterations and not artifacts. Intriguingly, the two chordoma patients in the CBTTC database had an identical heteroplasmic m.10971G>A MT-ND4 nonsense mutation. Similarly, our patient with recurrent gliofibroma harbored the same somatic MT-ND4 synonymous variant (m.10700A>G), detected at 53% heteroplasmy in the initial tumor, 79% in the first recurrence, and 97% in the second recurrence. Although the functional consequences of these alterations are not yet understood, our findings suggest that sequencing the mtDNA genome may be used to characterize CNS tumors at diagnosis and monitor disease progression.
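The two kinds of percentage in this abstract can be made concrete with a short sketch. It assumes the common convention that heteroplasmy at a site is the fraction of reads carrying the variant allele; the abstract does not state how heteroplasmy was estimated. The helper name and the read counts are hypothetical, while the tumor counts are those reported above.

```python
def heteroplasmy(alt_reads, total_reads):
    """Fraction of mtDNA reads at a site that carry the variant allele.

    Assumed read-count definition, for illustration only.
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads

# Hypothetical read counts, chosen only to show the calculation.
print(f"{heteroplasmy(530, 1000):.0%}")

# Counts reported in the abstract: tumors with a somatic mtDNA mutation
# out of all tumor-normal pairs analyzed.
tumors_with_mutation = 222
tumors_analyzed = 749
print(f"{tumors_with_mutation / tumors_analyzed:.1%}")
```

The second calculation reproduces the 29.6% stated in the abstract.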

SMuRF: portable and accurate ensemble prediction of somatic mutations

Bioinformatics ◽

10.1093/bioinformatics/btz018 ◽

2019 ◽

Vol 35 (17) ◽

pp. 3157-3159 ◽

Cited By ~ 5

Author(s):

Weitai Huang ◽

Yu Amanda Guo ◽

Karthik Muthukumar ◽

Probhonjon Baruah ◽

Mei Mei Chang ◽

...

Keyword(s):

Somatic Mutation ◽

Variant Calling ◽

High Accuracy ◽

Supervised Machine Learning ◽

Supplementary Information ◽

Ensemble Prediction ◽

Whole Genome Sequencing Data ◽

Sequencing Data ◽

Machine Learning Approach ◽

Somatic Mutation Calling

Abstract. Summary: Somatic Mutation calling method using a Random Forest (SMuRF) integrates predictions and auxiliary features from multiple somatic mutation callers using a supervised machine learning approach. SMuRF is trained on community-curated matched tumor and normal whole genome sequencing data. SMuRF predicts both SNVs and indels with high accuracy in genome or exome-level sequencing data. Furthermore, the method is robust across multiple tested cancer types and predicts low allele frequency variants with high accuracy. In contrast to existing ensemble-based somatic mutation calling approaches, SMuRF works out-of-the-box and is orders of magnitude faster. Availability and implementation: The method is implemented in R and available at https://github.com/skandlab/SMuRF. SMuRF operates as an add-on to the community-developed bcbio-nextgen somatic variant calling pipeline. Supplementary information: Supplementary data are available at Bioinformatics online.

Comparison of three variant callers for human whole genome sequencing

10.1101/461798 ◽

2018 ◽

Author(s):

Anna Supernat ◽

Oskar Valdimar Vidarsson ◽

Vidar M. Steen ◽

Tomasz Stokowy

Keyword(s):

Whole Genome Sequencing ◽

Genome Sequencing ◽

Single Gene ◽

Reference Sample ◽

Variant Calling ◽

Whole Genome Sequencing Data ◽

Whole Genome ◽

Sequencing Data ◽

Whole Exome ◽

Indel Calling

Abstract. Testing of patients with genetics-related disorders is shifting from single gene assays to gene panel sequencing, whole-exome sequencing (WES) and whole-genome sequencing (WGS). Since WGS is becoming a new foundation for molecular analyses, we decided to compare three currently used tools for variant calling of human whole genome sequencing data. We tested DeepVariant, a new TensorFlow machine learning-based variant caller, and compared this tool to GATK 4.0 and SpeedSeq, using 30×, 15× and 10× WGS data of the well-known NA12878 DNA reference sample. According to our comparison, the performance on SNV calling was similar in 30× data, with all three variant callers reaching F-Scores (i.e. harmonic mean of recall and precision) equal to 0.98. In contrast, DeepVariant was more precise in indel calling than GATK and SpeedSeq, as demonstrated by F-Scores of 0.94, 0.90 and 0.84, respectively. We conclude that the DeepVariant tool has great potential and usefulness for analysis of WGS data in medical genetics.
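The F-Score mentioned above, the harmonic mean of recall and precision, can be written out directly. The sketch below is a generic illustration; the function name and the input values are hypothetical and are not the per-caller precision or recall from the study, which the abstract does not report.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical inputs chosen to show how the harmonic mean behaves.
balanced = f_score(0.75, 0.75)
imbalanced = f_score(0.90, 0.60)
print(round(balanced, 3), round(imbalanced, 3))
```

Both pairs have the same arithmetic mean (0.75), but the harmonic mean is lower for the imbalanced pair. A caller therefore cannot reach a high F-Score by trading away recall for precision or the reverse.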

Can we distinguish modes of selective interactions using linkage disequilibrium?

10.1101/2021.03.25.437004 ◽

2021 ◽

Author(s):

Aaron P Ragsdale

Keyword(s):

Linkage Disequilibrium ◽

Allele Frequency ◽

Data Interpretation ◽

Numerical Approach ◽

Interactive Effects ◽

Whole Genome Sequencing Data ◽

Human Populations ◽

Missense Mutations ◽

Sequencing Data ◽

Selective Interactions

Selected mutations interfere and interact with evolutionary processes at nearby loci, distorting allele frequency trajectories and correlations between pairs of mutations. A number of recent studies have used patterns of linkage disequilibrium (LD) between selected variants to test for selective interference and epistatic interactions, with some disagreement over interpreting observations from data. Interpretation is hindered by the relative lack of analytic or even numerical expectations for patterns of variation between pairs of loci under the combined effects of selection, dominance, epistasis, and demography. Here, I develop a numerical approach to compute the expected two-locus sampling distribution under diploid selection with arbitrary epistasis and dominance, recombination, and variable population size. I use this to explore how epistasis and dominance affect expected signed LD, including for non-steady-state demography relevant to human populations. Finally, I use whole-genome sequencing data from humans to assess how well we can differentiate modes of selective interactions in practice. I find that positive LD between missense mutations within genes is driven by strong positive allele-frequency correlations between pairs of mutations that fall within the same conserved domain, pointing to compensatory mutations or antagonistic epistasis as the prevailing mode of interaction within but not outside of conserved genic elements. The heterogeneous landscape of both mutational fitness effects and selective interactions within protein-coding genes calls for more refined inferences of the joint distribution of fitness and interactive effects, and the methods presented here should prove useful in that pursuit.
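For readers unfamiliar with signed LD, the standard two-locus covariance D = f(AB) - f(A) f(B) can be computed from haplotype counts as sketched below. This is the textbook definition only, not the numerical method for the two-locus sampling distribution developed in the paper. The choice of which allele is labeled A or B (for example, the derived or minor allele) is an assumption that fixes the sign, and the counts are hypothetical.

```python
def signed_ld(n_AB, n_Ab, n_aB, n_ab):
    """Signed linkage disequilibrium D = f(AB) - f(A) * f(B).

    Arguments are counts of the four two-locus haplotypes, where A and B
    are the alleles of interest at the first and second locus.
    """
    n = n_AB + n_Ab + n_aB + n_ab
    if n == 0:
        raise ValueError("no haplotypes sampled")
    f_AB = n_AB / n
    f_A = (n_AB + n_Ab) / n
    f_B = (n_AB + n_aB) / n
    return f_AB - f_A * f_B

# Hypothetical haplotype counts for illustration.
coupled = signed_ld(40, 10, 10, 40)    # A and B co-occur more than expected
repulsed = signed_ld(10, 40, 40, 10)   # A and B co-occur less than expected
print(round(coupled, 3), round(repulsed, 3))
```

Positive D means the two focal alleles sit on the same haplotype more often than independence predicts, and negative D means less often. The abstract's comparison of positive LD between missense mutations relies on this sign.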

Rapid clinical diagnostic variant investigation of genomic patient sequencing data with iobio web tools

Journal of Clinical and Translational Science ◽

10.1017/cts.2017.311 ◽

2017 ◽

Vol 1 (6) ◽

pp. 381-386 ◽

Cited By ~ 5

Author(s):

Alistair Ward ◽

Mary A. Karren ◽

Tonya Di Sera ◽

Chase Miller ◽

Matt Velinder ◽

...

Keyword(s):

Genetic Testing ◽

Real Time ◽

Computational Analysis ◽

Disease Diagnosis ◽

Epileptic Encephalopathy ◽

Inherited Disease ◽

Sequencing Data ◽

Web Based ◽

Web Tools ◽

Clinical Diagnostic

Introduction: Computational analysis of genome or exome sequences may improve inherited disease diagnosis, but is costly and time-consuming. Methods: We describe the use of iobio, a web-based tool suite for intuitive, real-time genome diagnostic analyses. Results: We used iobio to identify the disease-causing variant in a patient with early infantile epileptic encephalopathy with prior nondiagnostic genetic testing. Conclusions: Iobio tools can be used by clinicians to rapidly identify disease-causing variants from genomic patient sequencing data.
