Using off-target data from whole-exome sequencing to improve genotyping accuracy, association analysis and polygenic risk prediction

Author(s):  
Jinzhuang Dou ◽  
Degang Wu ◽  
Lin Ding ◽  
Kai Wang ◽  
Minghui Jiang ◽  
...  

Abstract Whole-exome sequencing (WES) has been widely used to study the role of protein-coding variants in genetic diseases. Non-coding regions, typically covered by sparse off-target data, are often discarded by conventional WES analyses. Here, we develop a genotype calling pipeline named WEScall to analyse both target and off-target data. We leverage linkage disequilibrium shared within study samples and from an external reference panel to improve genotyping accuracy. In an application to WES of 2527 Chinese and Malays, WEScall can reduce the genotype discordance rate from 0.26% (SE = 6.4 × 10⁻⁶) to 0.08% (SE = 3.6 × 10⁻⁶) across 1.1 million single nucleotide polymorphisms (SNPs) in the deeply sequenced target regions. Furthermore, we obtain genotypes at 0.70% (SE = 3.0 × 10⁻⁶) discordance rate across 5.2 million off-target SNPs, which had ~1.2× mean sequencing depth. Using this dataset, we perform genome-wide association studies of 10 metabolic traits. Despite our small sample size, we identify 10 loci at genome-wide significance (P < 5 × 10⁻⁸), including eight well-established loci. The two novel loci, both associated with glycated haemoglobin levels, are GPATCH8-SLC4A1 (rs369762319, P = 2.56 × 10⁻¹²) and ROR2 (rs1201042, P = 3.24 × 10⁻⁸). Finally, using summary statistics from UK Biobank and Biobank Japan, we show that polygenic risk prediction can be significantly improved for six out of nine traits by incorporating off-target data (P < 0.01). These results demonstrate WEScall as a useful tool to facilitate WES studies with decent amounts of off-target data.
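
The discordance rates quoted in the abstract above compare called genotypes against a trusted set. The following is a minimal sketch of such a comparison, not WEScall's own code: the function name, the 0/1/2 genotype coding, the use of `None` for missing calls, and the binomial standard error are all assumptions made for illustration, and the abstract does not say how its SE values were derived.

```python
import math

def discordance_rate(called, truth):
    """Return (rate, binomial SE) over genotype pairs where both calls are present.

    Genotypes are assumed to be coded 0/1/2 (alternate-allele count),
    with None marking a missing call. Hypothetical helper for illustration.
    """
    pairs = [(c, t) for c, t in zip(called, truth)
             if c is not None and t is not None]
    if not pairs:
        raise ValueError("no comparable genotypes")
    n = len(pairs)
    mismatches = sum(1 for c, t in pairs if c != t)
    rate = mismatches / n
    # Standard error under a simple binomial model (an assumption).
    se = math.sqrt(rate * (1 - rate) / n)
    return rate, se
```

Missing calls are skipped rather than counted as errors, so the rate reflects only sites where both sources report a genotype.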

2017 ◽  
Vol 3 (5) ◽  
pp. e177 ◽  
Author(s):  
Javier Ruiz-Martínez ◽  
Luis J. Azcona ◽  
Alberto Bergareche ◽  
Jose F. Martí-Massó ◽  
Coro Paisán-Ruiz

Objective: Despite the enormous advancements made in deciphering the genetic architecture of Parkinson disease (PD), the majority of PD is idiopathic, with single gene mutations explaining only a small proportion of the cases. Methods: In this study, we clinically evaluated 2 unrelated Spanish families diagnosed with PD, in which known PD genes were previously excluded, and performed whole-exome sequencing analyses in affected individuals for disease gene identification. Results: Patients were diagnosed with typical PD without relevant distinctive symptoms. Two different novel mutations were identified in the CSMD1 gene. The CSMD1 gene, which encodes a complement control protein that is known to participate in complement activation and inflammation in the developing CNS, was previously shown to be associated with the risk of PD in a genome-wide association study. Conclusions: We conclude that the CSMD1 mutations identified in this study might be responsible for the PD phenotype observed in our examined patients. This, along with previously reported studies, may suggest the complement pathway as an important therapeutic target for PD and other neurodegenerative diseases.


2017 ◽  
Vol 107 (2) ◽  
pp. 457-466.e9 ◽  
Author(s):  
Svetlana A. Yatsenko ◽  
Priya Mittal ◽  
Michelle A. Wood-Trageser ◽  
Mirka W. Jones ◽  
Urvashi Surti ◽  
...  

2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Chiara Fabbri ◽  
Siegfried Kasper ◽  
Alexander Kautzky ◽  
Joseph Zohar ◽  
Daniel Souery ◽  
...  

2016 ◽  
Vol 15 ◽  
pp. CIN.S36612 ◽  
Author(s):  
Lun-Ching Chang ◽  
Biswajit Das ◽  
Chih-Jian Lih ◽  
Han Si ◽  
Corinne E. Camalier ◽  
...  

With rapid advances in DNA sequencing technologies, whole exome sequencing (WES) has become a popular approach for detecting somatic mutations in oncology studies. The initial intent of WES was to characterize single nucleotide variants, but it was observed that the number of sequencing reads that mapped to a genomic region correlated with the DNA copy number variants (CNVs). We propose a method, RefCNV, that uses a reference set to estimate the distribution of the coverage for each exon. The construction of the reference set includes an evaluation of the sources of variability in the coverage distribution. We observed that the processing steps had an impact on the coverage distribution. For each exon, we compared the observed coverage with the expected normal coverage. Thresholds for determining CNVs were selected to control the false-positive error rate. RefCNV prediction correlated significantly (r = 0.96–0.86) with CNV measured by digital polymerase chain reaction for MET (7q31), EGFR (7p12), or ERBB2 (17q12) in 13 tumor cell lines. The genome-wide CNV analysis showed a good overall correlation (Spearman's coefficient = 0.82) between RefCNV estimation and publicly available CNV data in Cancer Cell Line Encyclopedia. RefCNV also showed better performance than three other CNV estimation methods in genome-wide CNV analysis.
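
The general idea of comparing each exon's observed coverage with a reference distribution can be sketched as below. This is an illustrative z-score approach and not the RefCNV implementation: the function name, the dictionary layout, the normalized-coverage inputs, and the default threshold of 3 are all assumptions, and the abstract does not state which statistic or thresholds the authors used.

```python
from statistics import mean, stdev

def call_exon_cnv(reference, sample, z_threshold=3.0):
    """Label each exon 'gain', 'loss' or 'neutral' from a per-exon z-score.

    reference: exon -> list of normalized coverage values from normal samples
               (at least two values per exon, as stdev requires)
    sample:    exon -> normalized coverage in the sample under test
    All names and the threshold are hypothetical.
    """
    calls = {}
    for exon, ref_values in reference.items():
        mu = mean(ref_values)
        sd = stdev(ref_values)
        if sd == 0:
            # No spread in the reference: a z-score is undefined, so skip calling.
            calls[exon] = "neutral"
            continue
        z = (sample[exon] - mu) / sd
        if z > z_threshold:
            calls[exon] = "gain"
        elif z < -z_threshold:
            calls[exon] = "loss"
        else:
            calls[exon] = "neutral"
    return calls
```

Raising `z_threshold` trades sensitivity for fewer false-positive calls, which mirrors the abstract's point that thresholds are chosen to control the false-positive rate.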


2021 ◽  
Author(s):  
Jayant Mahadevan ◽  
Ajai Kumar Pathak ◽  
Alekhya Vemula ◽  
Ravi Kumar Nadella ◽  
Biju Viswanath ◽  
...  

Evolutionary trends may underlie some aspects of the risk for common, non-communicable disorders, including psychiatric disease. We analyzed whole exome sequencing data from 80 unique individuals from India coming from families with two or more individuals with severe mental illness. We used Population Branch Statistics (PBS) to identify variants and genes under positive selection and identified 75 genes as candidates for positive selection. Of these, 20 were previously associated with schizophrenia, Alzheimer's disease and cognitive abilities in genome-wide association studies. We then checked whether any of these 75 genes were involved in common biological pathways or related to specific cellular or molecular functions. We found that immune-related pathways and functions related to innate immunity, such as antigen binding, were over-represented. We also evaluated the presence of Neanderthal introgressed segments in these genes and found Neanderthal introgression in a single gene out of the 75 candidate genes. However, the introgression pattern indicates the region is unlikely to be the source for selection. Our findings hint at how selection pressures in individuals from families with a history of severe mental illness may diverge from the general population. Further, they provide insights into the genetic architecture of severe mental illness, such as schizophrenia, and its link to immune factors.
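
For readers unfamiliar with the population branch statistic, it is commonly defined from three pairwise F_ST values: each is transformed to a branch length T = −ln(1 − F_ST), and the focal population's branch is (T_AB + T_AC − T_BC) / 2. The sketch below implements that standard definition; the function and argument names are hypothetical, and the abstract does not say which comparison populations or F_ST estimator the study used.

```python
import math

def population_branch_statistic(fst_ab, fst_ac, fst_bc):
    """PBS for focal population A, given pairwise F_ST with populations B and C.

    Each F_ST must lie in [0, 1) so that the log transform is defined.
    """
    for fst in (fst_ab, fst_ac, fst_bc):
        if not 0.0 <= fst < 1.0:
            raise ValueError("F_ST values must be in [0, 1)")
    # Log-transform F_ST into additive divergence times.
    t_ab = -math.log(1.0 - fst_ab)
    t_ac = -math.log(1.0 - fst_ac)
    t_bc = -math.log(1.0 - fst_bc)
    # Branch length leading to population A.
    return (t_ab + t_ac - t_bc) / 2.0
```

A large value at a locus indicates allele-frequency change specific to the focal population, which is why extreme values are treated as candidates for positive selection.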


2020 ◽  
Author(s):  
Lin Zhang ◽  
Zheng Cao ◽  
Fan Feng ◽  
Ya-Nan Xu ◽  
Lin Li ◽  
...  

Abstract Background This study investigates the genetic cause of preeclampsia (PE), a leading cause of maternal and perinatal death whose underlying molecular mechanisms remain poorly understood. Many single nucleotide polymorphisms have been identified by genome-wide association studies and were found to be associated with PE; however, few studies have used whole-exome sequencing (WES) to identify PE mutations. Methods Five patients with severe early-onset preeclampsia (EOPE) were recruited, and WES was performed on each patient. Sanger sequencing was used to confirm the potential causative genetic mutation. Results After a stringent bioinformatics analysis, a rare mutation in the GOT1 gene, c.44C>G:p.P15R, was found in one patient. Bioinformatics analysis showed that the mutation site is highly conserved across several species and was predicted to be a pathogenic mutation according to several online mutational function prediction software packages. Further structural biology homology modeling suggested that P15R would change the electric environment of the enzymatic center and might affect the binding affinity of substrate or product. Conclusion We demonstrated for the first time that the mutation in GOT1 may be associated with EOPE. The results of this study provide researchers and clinicians with a better understanding of the molecular mechanisms that underlie maternal severe EOPE.


2019 ◽  
Vol 186 (4) ◽  
pp. 574-579
Author(s):  
Rami Khoriaty ◽  
Ayse B. Ozel ◽  
Shweta Ramdas ◽  
Charles Ross ◽  
Karl Desch ◽  
...  

2019 ◽  
Author(s):  
Mei Sim Lung ◽  
Catherine A. Mitchell ◽  
Maria A. Doyle ◽  
Andrew C. Lynch ◽  
Kylie L. Gorringe ◽  
...  

Abstract Background Familial cases of appendiceal mucinous tumours (AMTs) are extremely rare and the underlying genetic aetiology uncertain. We identified potential predisposing germline genetic variants in a father and daughter with AMTs presenting with pseudomyxoma peritonei (PMP) and correlated these with regions of loss of heterozygosity (LOH) in the tumours. Methods Through germline whole exome sequencing, we identified novel heterozygous loss-of-function (LoF) (i.e. nonsense, frameshift and essential splice site mutations) and missense variants shared between father and daughter, and validated all LoF variants, and missense variants with a Combined Annotation Dependent Depletion (CADD) scaled score of ≥10. Genome-wide copy number analysis was performed on tumour tissue from both individuals to identify regions of LOH. Results Fifteen novel variants in 15 genes were shared by the father and daughter, including a nonsense mutation in REEP5. None of these germline variants were located in tumour regions of LOH shared by the father and daughter. Four genes (EXOG, RANBP2, RANBP6 and TNFRSF1B) harboured missense variants that fell in a region of LOH in the tumour from the father only, but none showed somatic loss of the wild-type allele in the tumour. The REEP5 gene was sequenced in 23 individuals with presumed sporadic AMTs or PMP; no LoF or rare missense germline variants were identified. Conclusion Germline exome sequencing of a father and daughter with AMTs identified novel candidate predisposing genes. Further studies are required to clarify the role of these genes in familial AMTs.
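
The selection rule described in the Methods above (shared variants, keeping all LoF variants plus missense variants with a CADD scaled score of ≥10) can be sketched as a simple filter. This is only an illustration under assumed inputs: the record fields `id`, `consequence` and `cadd`, the consequence labels, and the function name are hypothetical and do not come from the study's pipeline.

```python
# Consequence labels treated as loss-of-function (assumed naming).
LOF_CONSEQUENCES = {"nonsense", "frameshift", "essential_splice"}

def shared_candidates(father, daughter, cadd_min=10.0):
    """Return IDs of variants carried by both relatives that pass the filter.

    father, daughter: lists of dicts with hypothetical keys
    'id', 'consequence' and (for missense variants) 'cadd'.
    """
    daughter_ids = {v["id"] for v in daughter}
    kept = []
    for v in father:
        if v["id"] not in daughter_ids:
            continue  # not shared between the two relatives
        if v["consequence"] in LOF_CONSEQUENCES:
            kept.append(v["id"])  # all shared LoF variants are kept
        elif v["consequence"] == "missense" and v.get("cadd", 0.0) >= cadd_min:
            kept.append(v["id"])  # missense kept only above the CADD cutoff
    return kept
```

Missense variants without a CADD score default to 0 here and are therefore dropped, which is a conservative choice made for this sketch.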

