A general framework for predicting the transcriptomic consequences of non-coding variation

Mapping Intimacies ◽

10.1101/279323 ◽

2018 ◽

Cited By ~ 1

Author(s):

Moustafa Abdalla ◽

Mohamed Abdalla ◽

Mark I. McCarthy ◽

Chris C. Holmes

Keyword(s):

Complex Traits ◽

Linear Models ◽

Association Studies ◽

Transcript Abundance ◽

Evolutionary Constraint ◽

Genome Wide Association Studies ◽

Tissue Specific ◽

Genotype Variation ◽

Functional Understanding ◽

Allele Specific

Abstract: Genome-wide association studies (GWASs) for complex traits have implicated thousands of genetic loci. Most GWAS-nominated variants lie in noncoding regions, complicating the systematic translation of these findings into functional understanding. Here, we leverage convolutional neural networks to assist in this challenge. Our computational framework, peaBrain, models the transcriptional machinery of a tissue as a two-stage process: first, predicting the mean tissue-specific abundance of all genes and second, incorporating the transcriptomic consequences of genotype variation to predict individual abundance on a subject-by-subject basis. We demonstrate that peaBrain accounts for the majority (>50%) of variance observed in mean transcript abundance across most tissues and outperforms regularized linear models in predicting the consequences of individual genotype variation. We highlight the validity of the peaBrain model by calculating non-coding impact scores that correlate with nucleotide evolutionary constraint and that are also predictive of disease-associated variation and allele-specific transcription factor binding. We further show how these tissue-specific peaBrain scores can be leveraged to pinpoint functional tissues underlying complex traits, outperforming methods that depend on colocalization of eQTL and GWAS signals. We subsequently derive continuous dense embeddings of genes for downstream applications, and identify putatively functional eQTLs that are missed by high-throughput experimental approaches.

A transcriptome-wide Mendelian randomization study to uncover tissue-dependent regulatory mechanisms across the human phenome

10.1101/563379 ◽

2019 ◽

Cited By ~ 2

Author(s):

Tom G Richardson ◽

Gibran Hemani ◽

Tom R Gaunt ◽

Caroline L Relton ◽

George Davey Smith

Keyword(s):

Gene Expression ◽

Genetic Variants ◽

Complex Traits ◽

Mendelian Randomization ◽

Drug Repositioning ◽

Association Studies ◽

Thyroid Tissue ◽

Genome Wide Association Studies ◽

Tissue Specific ◽

Genome Wide

Abstract:

Background: Developing insight into tissue-specific transcriptional mechanisms can help improve our understanding of how genetic variants exert their effects on complex traits and disease. By applying the principles of Mendelian randomization, we have undertaken a systematic analysis to evaluate transcriptome-wide associations between gene expression across 48 different tissue types and 395 complex traits.

Results: Overall, we identified 100,025 gene-trait associations based on conventional genome-wide corrections (P < 5 × 10⁻⁸) that also provided evidence of genetic colocalization. These results indicated that genetic variants which influence gene expression levels in multiple tissues are more likely to influence multiple complex traits. We identified many examples of tissue-specific effects, such as genetically-predicted TPO, NR3C2 and SPATA13 expression only associating with thyroid disease in thyroid tissue. Additionally, FBN2 expression was associated with both cardiovascular and lung function traits, but only when analysed in heart and lung tissue respectively. We also demonstrate that conducting phenome-wide evaluations of our results can help flag adverse on-target side effects for therapeutic intervention, as well as propose drug repositioning opportunities. Moreover, we find that exploring the tissue-dependency of associations identified by genome-wide association studies (GWAS) can help elucidate the causal genes and tissues responsible for effects, as well as uncover putative novel associations.

Conclusions: The atlas of tissue-dependent associations we have constructed should prove extremely valuable to future studies investigating the genetic determinants of complex disease. The follow-up analyses we have performed in this study are merely a guide for future research. Similar evaluations can be conducted systematically at http://mrcieu.mrsoftware.org/Tissue_MR_atlas/.
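
To make the Mendelian randomization step concrete, here is a minimal sketch of a single-variant Wald ratio estimate, one standard MR estimator. The abstract does not say which estimator the authors used, so the function name `wald_ratio`, its parameters, and the toy numbers are assumptions for illustration only. The standard error uses the common first-order approximation that ignores uncertainty in the variant-expression effect.

```python
import math

GENOME_WIDE_P = 5e-8  # threshold quoted in the abstract


def wald_ratio(beta_expression, beta_trait, se_trait):
    """Single-instrument Wald ratio (illustrative, not the authors' pipeline).

    beta_expression: effect of the variant on gene expression (eQTL effect)
    beta_trait:      effect of the same variant on the trait (GWAS effect)
    se_trait:        standard error of beta_trait
    """
    if beta_expression == 0:
        raise ValueError("the instrument must have a non-zero effect on expression")
    estimate = beta_trait / beta_expression
    # First-order approximation: treats beta_expression as known without error.
    se = se_trait / abs(beta_expression)
    z = estimate / se
    # Two-sided p-value under a standard normal null.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return estimate, se, p_value


# Hypothetical summary statistics for one gene-trait pair.
estimate, se, p_value = wald_ratio(beta_expression=0.5, beta_trait=0.1, se_trait=0.02)
passes_threshold = p_value < GENOME_WIDE_P
```

In this toy case the z-score is 5, which is strong evidence but does not clear the genome-wide threshold. The colocalization evidence mentioned in the abstract is a separate analysis and is not covered by this sketch.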

Integrative Tissue-Specific Functional Annotations in the Human Genome Provide Novel Insights on Many Complex Traits and Improve Signal Prioritization in Genome Wide Association Studies

PLoS Genetics ◽

10.1371/journal.pgen.1005947 ◽

2016 ◽

Vol 12 (4) ◽

pp. e1005947 ◽

Cited By ~ 56

Author(s):

Qiongshi Lu ◽

Ryan Lee Powles ◽

Qian Wang ◽

Beixin Julie He ◽

Hongyu Zhao

Keyword(s):

Human Genome ◽

Complex Traits ◽

Association Studies ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Tissue Specific ◽

Functional Annotations ◽

Genome Wide

Estimating heritability and its enrichment in tissue-specific gene sets in admixed populations

10.1101/503144 ◽

2018 ◽

Cited By ~ 3

Author(s):

Yang Luo ◽

Xinyi Li ◽

Xin Wang ◽

Steven Gazal ◽

Josep Maria Mercader ◽

...

Keyword(s):

African American ◽

Complex Traits ◽

Association Studies ◽

Genetic Data ◽

Age At Menarche ◽

Specific Gene ◽

Genome Wide Association Studies ◽

Diverse Populations ◽

Tissue Specific ◽

Different Populations

Abstract: The increasing size and diversity of genome-wide association studies provide an exciting opportunity to study how the genetics of complex traits vary among diverse populations. Here, we introduce covariate-adjusted LD score regression (cov-LDSC), a method to accurately estimate genetic heritability and its enrichment in both homogeneous and admixed populations with summary statistics and in-sample LD estimates. In-sample LD can be estimated from a subset of the GWAS samples, allowing our method to be applied efficiently to very large cohorts. In simulations, we show that unadjusted LDSC underestimates heritability by 10% to 60% in admixed populations; in contrast, cov-LDSC is robust to all simulation parameters. We apply cov-LDSC to genotyping data from approximately 170,000 Latino, 47,000 African American and 135,000 European individuals. We estimate and detect heritability enrichment in three quantitative and five dichotomous phenotypes respectively, making this, to our knowledge, the most comprehensive heritability-based analysis of admixed individuals. Our results show that most traits have high concordance of heritability and consistent tissue-specific heritability enrichment among different populations. However, for age at menarche, we observe population-specific heritability estimates. We observe consistent patterns of tissue-specific heritability enrichment across populations; for example, in the limbic system for BMI, the per-standardized-annotation effect size τ* is 0.16 ± 0.04, 0.28 ± 0.11 and 0.18 ± 0.03 in Latino, African American and European populations respectively. Our results demonstrate that our approach is a powerful way to analyze genetic data for complex traits from underrepresented populations.

Author summary: Admixed populations such as African Americans and Hispanic Americans bear a disproportionately high burden of disease but remain underrepresented in current genetic studies. It is important to extend current methodological advancements for understanding the genetic basis of complex traits in homogeneous populations to individuals with admixed genetic backgrounds. Here, we develop a computationally efficient method to answer two specific questions. First, does genetic variation contribute to the same amount of phenotypic variation (heritability) across diverse populations? Second, are the genetic mechanisms shared among different populations? To answer these questions, we use our novel method to conduct the first comprehensive heritability-based analysis of a large number of admixed individuals. We show that there is a high degree of concordance in total heritability and tissue-specific enrichment between different ancestral groups. However, traits such as age at menarche show noticeable differences among populations. Our work provides a powerful way to analyze genetic data in admixed populations and may contribute to the applicability of genomic medicine to admixed population groups.
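
For readers unfamiliar with LD score regression, the sketch below shows the basic unadjusted idea that cov-LDSC builds on: under the standard model, the expected chi-square statistic of a variant grows linearly with its LD score, with slope N·h²/M, so heritability can be read off a regression slope. This is a simplified assumption-laden illustration, not cov-LDSC itself. It uses ordinary least squares, whereas real implementations use regression weights and a block jackknife, and it does not perform the covariate adjustment of LD scores that the abstract describes. The function name and toy data are hypothetical.

```python
import numpy as np


def ldsc_heritability(chi2, ld_scores, n_samples, n_snps):
    """Unweighted LD score regression sketch.

    Model: E[chi2_j] = intercept + (n_samples * h2 / n_snps) * ld_score_j
    Returns the heritability estimate and the fitted intercept.
    """
    chi2 = np.asarray(chi2, dtype=float)
    ld_scores = np.asarray(ld_scores, dtype=float)
    # polyfit returns coefficients from highest degree down: [slope, intercept].
    slope, intercept = np.polyfit(ld_scores, chi2, deg=1)
    h2 = slope * n_snps / n_samples
    return h2, intercept


# Noise-free toy data generated from the model with h2 = 0.4 and intercept 1.
toy_ld = np.linspace(1.0, 200.0, 50)
toy_chi2 = 1.0 + (50_000 * 0.4 / 1_000_000) * toy_ld
h2_hat, intercept_hat = ldsc_heritability(toy_chi2, toy_ld, n_samples=50_000, n_snps=1_000_000)
```

If the LD scores are mis-estimated for an admixed sample, the fitted slope and therefore the heritability estimate are biased. That is the failure mode the abstract attributes to unadjusted LDSC.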

Scalable unified framework of total and allele-specific counts for cis-QTL, fine-mapping, and prediction

10.1101/2020.04.22.050666 ◽

2020 ◽

Author(s):

Yanyu Liang ◽

François Aguet ◽

Alvaro Barbeira ◽

Kristin Ardlie ◽

Hae Kyung Im

Keyword(s):

Gene Expression ◽

Fine Mapping ◽

Complex Traits ◽

Association Studies ◽

Average Power ◽

Specific Gene ◽

Genome Wide Association Studies ◽

Unified Framework ◽

Causal Genes ◽

Allele Specific

Abstract: Genome-wide association studies (GWAS) have been highly successful in identifying genomic loci associated with complex traits. However, identification of the causal genes that mediate these associations remains challenging, and many approaches integrating transcriptomic data with GWAS have been proposed. Yet there currently exist no computationally scalable methods that integrate total and allele-specific gene expression to maximize power to detect genetic effects on gene expression. Here, we describe a unified framework that is scalable to studies with thousands of samples. Using simulations and data from GTEx, we demonstrate an average power gain equivalent to a 29% increase in sample size for genes with sufficient allele-specific read coverage. We provide a suite of freely available tools, mixQTL, mixFine, and mixPred, that apply this framework for mapping of quantitative trait loci, fine-mapping, and prediction.
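
One way to see how adding allele-specific information can translate into an "equivalent sample size" gain is inverse-variance weighting of two independent effect estimates. The sketch below assumes that approach for illustration. The abstract does not specify how the framework combines the two signals, so the function `combine_estimates`, its arguments, and the numbers are hypothetical and should not be read as the mixQTL implementation.

```python
import math


def combine_estimates(beta_total, se_total, beta_ase, se_ase):
    """Inverse-variance weighted combination of two independent estimates.

    Returns the combined effect, its standard error, and the fractional
    increase in sample size that would give the total-count estimate the
    same variance (assuming variance scales as 1 / sample size).
    """
    w_total = 1.0 / se_total**2
    w_ase = 1.0 / se_ase**2
    beta = (w_total * beta_total + w_ase * beta_ase) / (w_total + w_ase)
    se = math.sqrt(1.0 / (w_total + w_ase))
    equivalent_gain = (se_total**2) / (se**2) - 1.0
    return beta, se, equivalent_gain


# Hypothetical estimates from total read counts and from allelic imbalance.
beta, se, gain = combine_estimates(beta_total=0.2, se_total=0.1, beta_ase=0.3, se_ase=0.2)
```

Here the allele-specific estimate is noisier than the total-count estimate, yet combining them is still equivalent to a 25% larger sample for the total-count analysis alone.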

Pig genome functional annotation enhances biological interpretations of complex traits and comparative epigenomics

10.21203/rs.3.rs-253276/v1 ◽

2021 ◽

Author(s):

Huaijun Zhou ◽

Zhangyuan Pan ◽

Yuelin Yao ◽

Hongwei Yin ◽

Zexi Cai ◽

...

Keyword(s):

Adaptive Evolution ◽

Complex Traits ◽

Functional Annotation ◽

Molecular Mechanisms ◽

Association Studies ◽

Regulatory Elements ◽

Neutral Evolution ◽

Genome Wide Association Studies ◽

Tissue Specific ◽

Slow Evolution

Abstract: The functional annotation of livestock genomes is crucial for understanding the molecular mechanisms that underpin complex traits of economic importance, adaptive evolution and comparative genomics. Here, we provide the most comprehensive catalogue to date of regulatory elements in the pig (Sus scrofa) by integrating 223 epigenomic and transcriptomic data sets, representing 14 biologically important tissues. We systematically describe the dynamic epigenetic landscape across tissues by functionally annotating 15 different chromatin states and defining their tissue-specific regulatory activities. We demonstrate that genomic variants associated with complex traits and adaptive evolution in pig are significantly enriched in active promoters and enhancers. Furthermore, we reveal distinct tissue-specific regulatory selection between Asian and European pig domestication processes. Compared with human and mouse epigenomes, we show that porcine regulatory elements are more conserved in DNA sequence, under both rapid and slow evolution, than those under neutral evolution across pig, mouse, and human. Finally, we provide novel biological insights on tissue-specific regulatory conservation and demonstrate that, depending on the traits, mouse or pig might be more appropriate biomedical models for different complex traits and diseases in humans through integrating comparative epigenomes with 47 human genome-wide association studies.

Leveraging allele-specific expression to refine fine-mapping for eQTL studies

10.1101/257279 ◽

2018 ◽

Cited By ~ 2

Author(s):

Jennifer Zou ◽

Farhad Hormozdiari ◽

Brandon Jew ◽

Jason Ernst ◽

Jae Hoon Sul ◽

...

Keyword(s):

Gene Expression ◽

Fine Mapping ◽

Complex Traits ◽

Association Studies ◽

Reduction Rate ◽

Genome Wide Association Studies ◽

Specific Expression ◽

Allele Specific Expression ◽

Allele Specific ◽

Causal Variants

Abstract: Many disease risk loci identified in genome-wide association studies are present in non-coding regions of the genome. It is hypothesized that these variants affect complex traits by acting as expression quantitative trait loci (eQTLs) that influence expression of nearby genes. This indicates that many causal variants for complex traits are likely to be causal variants for gene expression. Hence, identifying causal variants for gene expression is important for elucidating the genetic basis of not only gene expression but also complex traits. However, detecting causal variants is challenging due to complex genetic correlation among variants known as linkage disequilibrium (LD) and the presence of multiple causal variants within a locus. Although several fine-mapping approaches have been developed to overcome these challenges, they may produce large sets of putative causal variants when true causal variants are in high LD with many non-causal variants. In eQTL studies, there is an additional source of information that can be used to improve fine-mapping called allele-specific expression (ASE) that measures imbalance in gene expression due to different alleles. In this work, we develop a novel statistical method that leverages both ASE and eQTL information to detect causal variants that regulate gene expression. We illustrate through simulations and application to the Genotype-Tissue Expression (GTEx) dataset that our method identifies the true causal variants with higher specificity than an approach that uses only eQTL information. In the GTEx dataset, our method achieves a median reduction rate of 11% in the number of putative causal variants.
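
The "reduction rate" reported above can be illustrated with a small sketch. It assumes, hypothetically, that the set of putative causal variants is a credible set built from per-variant posterior probabilities, and that the reduction rate is the relative shrinkage in set size; the abstract does not spell out either definition. All names and numbers are illustrative.

```python
def credible_set(posteriors, level=0.95):
    """Smallest set of variant indices whose posterior mass reaches `level`."""
    ranked = sorted(range(len(posteriors)), key=lambda i: posteriors[i], reverse=True)
    chosen, mass = [], 0.0
    for index in ranked:
        chosen.append(index)
        mass += posteriors[index]
        if mass >= level:
            break
    return chosen


def reduction_rate(size_before, size_after):
    """Relative decrease in the number of putative causal variants."""
    return (size_before - size_after) / size_before


# Hypothetical posteriors for five variants in one locus.
eqtl_only = [0.5, 0.3, 0.1, 0.06, 0.04]     # diffuse: high LD blurs the signal
with_ase = [0.9, 0.06, 0.02, 0.01, 0.01]    # sharper after adding ASE evidence

set_before = credible_set(eqtl_only)
set_after = credible_set(with_ase)
rate = reduction_rate(len(set_before), len(set_after))
```

A sharper posterior concentrates mass on fewer variants, so fewer are needed to reach the same credible level. The toy locus shrinks from four variants to two.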

Pig genome functional annotation enhances the biological interpretation of complex traits and human disease

Nature Communications ◽

10.1038/s41467-021-26153-7 ◽

2021 ◽

Vol 12 (1) ◽

Author(s):

Zhangyuan Pan ◽

Yuelin Yao ◽

Hongwei Yin ◽

Zexi Cai ◽

Ying Wang ◽

...

Keyword(s):

Adaptive Evolution ◽

Complex Traits ◽

Functional Annotation ◽

Molecular Mechanisms ◽

Association Studies ◽

Regulatory Elements ◽

Neutral Evolution ◽

Genome Wide Association Studies ◽

Tissue Specific ◽

Slow Evolution

Abstract: The functional annotation of livestock genomes is crucial for understanding the molecular mechanisms that underpin complex traits of economic importance, adaptive evolution and comparative genomics. Here, we provide the most comprehensive catalogue to date of regulatory elements in the pig (Sus scrofa) by integrating 223 epigenomic and transcriptomic data sets, representing 14 biologically important tissues. We systematically describe the dynamic epigenetic landscape across tissues by functionally annotating 15 different chromatin states and defining their tissue-specific regulatory activities. We demonstrate that genomic variants associated with complex traits and adaptive evolution in pig are significantly enriched in active promoters and enhancers. Furthermore, we reveal distinct tissue-specific regulatory selection between Asian and European pig domestication processes. Compared with human and mouse epigenomes, we show that porcine regulatory elements are more conserved in DNA sequence, under both rapid and slow evolution, than those under neutral evolution across pig, mouse, and human. Finally, we provide biological insights on tissue-specific regulatory conservation, and by integrating 47 human genome-wide association studies, we demonstrate that, depending on the traits, mouse or pig might be more appropriate biomedical models for different complex traits and diseases.

Integrative tissue-specific functional annotations in the human genome provide novel insights on many complex traits and improve signal prioritization in genome wide association studies

10.1101/028464 ◽

2015 ◽

Author(s):

Qiongshi Lu ◽

Ryan Lee Powles ◽

Qian Wang ◽

Beixin Julie He ◽

Hongyu Zhao

Keyword(s):

Complex Traits ◽

Association Studies ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Tissue Specific ◽

Functional Annotations ◽

Functional Regions ◽

Genome Wide ◽

Multiple Resolutions ◽

Artery Disease

Extensive efforts have been made to understand genomic function through both experimental and computational approaches, yet proper annotation remains challenging, especially in non-coding regions. In this manuscript, we introduce GenoSkyline, an unsupervised learning framework to predict tissue-specific functional regions through integrating high-throughput epigenetic annotations. GenoSkyline successfully identified a variety of non-coding regulatory machinery including enhancers, regulatory miRNA, and hypomethylated transposable elements in extensive case studies. Integrative analysis of GenoSkyline annotations and results from genome-wide association studies (GWAS) led to novel biological insights on the etiologies of a number of human complex traits. We also explored using tissue-specific functional annotations to prioritize GWAS signals and predict relevant tissue types for each risk locus. Brain and blood-specific annotations led to better prioritization performance for schizophrenia than standard GWAS p-values and non-tissue-specific annotations. As for coronary artery disease, heart-specific functional regions were highly enriched for GWAS signals, but previously identified risk loci were found to be most functional in other tissues, suggesting a substantial proportion of still undetected heart-related loci. In summary, GenoSkyline annotations can guide genetic studies at multiple resolutions and provide valuable insights in understanding complex diseases. GenoSkyline is available at http://genocanyon.med.yale.edu/GenoSkyline.

Faculty Opinions recommendation of Estimation of complex effect-size distributions using summary-level statistics from genome-wide association studies across 32 complex traits.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.733803377.793550136 ◽

2018 ◽

Author(s):

Mohan Liu

Keyword(s):

Effect Size ◽

Complex Traits ◽

Association Studies ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Size Distributions ◽

Complex Effect ◽

Genome Wide ◽

Level Statistics

Family-based gene-environment interaction using sequence kernel association test (FGE-SKAT) for complex quantitative traits

Scientific Reports ◽

10.1038/s41598-021-86871-2 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Chao-Yu Guo ◽

Reng-Hong Wang ◽

Hsin-Chou Yang

Keyword(s):

Complex Traits ◽

Association Studies ◽

Association Test ◽

Whole Genome Sequence ◽

Environment Interaction ◽

Genome Wide Association Studies ◽

Whole Genome ◽

Sequence Kernel Association Test ◽

Gene Environment ◽

Family Based

Abstract: After the genome-wide association studies (GWAS) era, whole-genome sequencing is widely used to identify associations of complex traits with rare variants. A score-based variance-component test has been proposed to identify common and rare genetic variants associated with complex traits while quickly adjusting for covariates. Such a kernel score statistic allows for familial dependencies and adjusts for random confounding effects. However, the etiology of complex traits may involve the effects of genetic and environmental factors and the complex interactions between genes and the environment. Therefore, in this research, a novel method is proposed to detect gene and gene-environment interactions in a complex family-based association study with various correlated structures. We also developed an R function for the Fast Gene-Environment Sequence Kernel Association Test (FGE-SKAT), which is freely available as supplementary material for easy GWAS implementation to unveil such family-based joint effects. Simulation studies confirmed the validity of the new strategy and its superior statistical power. The FGE-SKAT was applied to the whole genome sequence data provided by Genetic Analysis Workshop 18 (GAW18) and discovered concordant and discordant regions compared to the methods without considering gene by environment interactions.
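
As background for the kernel score statistic mentioned above, the following sketch computes the basic SKAT-style variance-component score statistic for unrelated individuals with an intercept-only null model: the weighted sum of squared per-variant score contributions. This is deliberately the simplest textbook form and is an assumption for illustration. It omits the family correlation structure, covariates, and gene-environment terms that FGE-SKAT adds, and it does not compute a p-value, which requires the null distribution of the statistic (a mixture of chi-squares). The authors distribute an R function; Python is used here only to keep the example self-contained.

```python
import numpy as np


def skat_q(genotypes, phenotype, weights):
    """Basic SKAT-style score statistic Q = sum_j (w_j * g_j' r)^2.

    genotypes: (n_individuals, n_variants) allele counts
    phenotype: (n_individuals,) quantitative trait
    weights:   (n_variants,) per-variant weights, e.g. larger for rarer variants
    """
    genotypes = np.asarray(genotypes, dtype=float)
    phenotype = np.asarray(phenotype, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Residuals under the null model (intercept only, no covariates).
    residuals = phenotype - phenotype.mean()
    # Per-variant score: covariance-like sum between genotype and residual.
    scores = genotypes.T @ residuals
    return float(np.sum((weights * scores) ** 2))


# Toy region: four individuals, two variants.
toy_genotypes = [[0, 1], [1, 0], [2, 1], [0, 0]]
toy_phenotype = [1.0, 2.0, 3.0, 2.0]
q_equal = skat_q(toy_genotypes, toy_phenotype, weights=[1.0, 1.0])
q_upweighted = skat_q(toy_genotypes, toy_phenotype, weights=[2.0, 1.0])
```

Because each variant's score is squared before summing, variants with opposite directions of effect do not cancel out. This is the usual motivation for variance-component tests over burden tests.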
