Survey of the Heritability and Sparse Architecture of Gene Expression Traits Across Human Tissues

2016 ◽  
Author(s):  
Heather E. Wheeler ◽  
Kaanan P. Shah ◽  
Jonathon Brenner ◽  
Tzintzuni Garcia ◽  
Keston Aquino-Michaels ◽  
...  

Abstract: Understanding the genetic architecture of gene expression traits is key to elucidating the underlying mechanisms of complex traits. Here, for the first time, we perform a systematic survey of the heritability and the distribution of effect sizes across all representative tissues in the human body. We find that local h² can be relatively well characterized with 59% of expressed genes showing significant h² (FDR < 0.1) in the DGN whole blood cohort. However, current sample sizes (n ≤ 922) do not allow us to compute distal h². Bayesian Sparse Linear Mixed Model (BSLMM) analysis provides strong evidence that the genetic contribution to local expression traits is dominated by a handful of genetic variants rather than by the collective contribution of a large number of variants each of modest size. In other words, the local architecture of gene expression traits is sparse rather than polygenic across all 40 tissues (from DGN and GTEx) examined. This result is confirmed by the sparsity of optimal performing gene expression predictors via elastic net modeling. To further explore the tissue context specificity, we decompose the expression traits into cross-tissue and tissue-specific components using a novel Orthogonal Tissue Decomposition (OTD) approach. Through a series of simulations we show that the cross-tissue and tissue-specific components are identifiable via OTD. Heritability and sparsity estimates of these derived expression phenotypes show similar characteristics to the original traits. Consistent properties relative to prior GTEx multi-tissue analysis results suggest that these traits reflect the expected biology. Finally, we apply this knowledge to develop prediction models of gene expression traits for all tissues. The prediction models, heritability, and prediction performance R² for original and decomposed expression phenotypes are made publicly available (https://github.com/hakyimlab/PrediXcan).
Author Summary: Gene regulation is known to contribute to the underlying mechanisms of complex traits. The GTEx project has generated RNA-Seq data on hundreds of individuals across more than 40 tissues providing a comprehensive atlas of gene expression traits. Here, we systematically examined the local versus distant heritability as well as the sparsity versus polygenicity of protein coding gene expression traits in tissues across the entire human body. To determine tissue context specificity, we decomposed the expression levels into cross-tissue and tissue-specific components. Regardless of tissue type, we found that local heritability, but not distal heritability, can be well characterized with current sample sizes. We found that the distribution of effect sizes is more consistent with a sparse local architecture in all tissues. We also show that the cross-tissue and tissue-specific expression phenotypes constructed with our orthogonal tissue decomposition model recapitulate complex Bayesian multi-tissue analysis results. This knowledge was applied to develop prediction models of gene expression traits for all tissues, which we make publicly available.
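The idea of splitting expression into a cross-tissue and a tissue-specific component can be illustrated with a toy calculation. The sketch below is not the authors' OTD implementation: it swaps in a simple mean-based decomposition on a complete individuals-by-tissues matrix for one gene, and the function name `decompose_expression` and the simulated data are hypothetical.

```python
import numpy as np

def decompose_expression(expr):
    """Toy split of one gene's expression (individuals x tissues).

    Assumption: complete, balanced data. The cross-tissue component is the
    per-individual mean after removing tissue means; the tissue-specific
    component is whatever remains.
    """
    expr = np.asarray(expr, dtype=float)
    tissue_means = expr.mean(axis=0, keepdims=True)       # one mean per tissue
    centered = expr - tissue_means
    cross_tissue = centered.mean(axis=1, keepdims=True)   # one value per individual
    tissue_specific = centered - cross_tissue             # residual per individual and tissue
    return tissue_means, cross_tissue, tissue_specific

# Simulated example (hypothetical numbers): a shared per-individual effect plus tissue noise.
rng = np.random.default_rng(0)
shared = rng.normal(size=(50, 1))
expr = 5.0 + shared + rng.normal(scale=0.5, size=(50, 4))
tissue_means, cross_tissue, tissue_specific = decompose_expression(expr)
```

By construction the three pieces add back to the original matrix, and each individual's tissue-specific values sum to zero across tissues, which is the sense in which the two derived phenotypes do not overlap in this simplified version.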

2019 ◽  
Author(s):  
Tom G Richardson ◽  
Gibran Hemani ◽  
Tom R Gaunt ◽  
Caroline L Relton ◽  
George Davey Smith

Abstract
Background: Developing insight into tissue-specific transcriptional mechanisms can help improve our understanding of how genetic variants exert their effects on complex traits and disease. By applying the principles of Mendelian randomization, we have undertaken a systematic analysis to evaluate transcriptome-wide associations between gene expression across 48 different tissue types and 395 complex traits.
Results: Overall, we identified 100,025 gene-trait associations based on conventional genome-wide corrections (P < 5 × 10⁻⁸) that also provided evidence of genetic colocalization. These results indicated that genetic variants which influence gene expression levels in multiple tissues are more likely to influence multiple complex traits. We identified many examples of tissue-specific effects, such as genetically-predicted TPO, NR3C2 and SPATA13 expression only associating with thyroid disease in thyroid tissue. Additionally, FBN2 expression was associated with both cardiovascular and lung function traits, but only when analysed in heart and lung tissue respectively. We also demonstrate that conducting phenome-wide evaluations of our results can help flag adverse on-target side effects for therapeutic intervention, as well as propose drug repositioning opportunities. Moreover, we find that exploring the tissue-dependency of associations identified by genome-wide association studies (GWAS) can help elucidate the causal genes and tissues responsible for effects, as well as uncover putative novel associations.
Conclusions: The atlas of tissue-dependent associations we have constructed should prove extremely valuable to future studies investigating the genetic determinants of complex disease. The follow-up analyses we have performed in this study are merely a guide for future research. Conducting similar evaluations can be undertaken systematically at http://mrcieu.mrsoftware.org/Tissue_MR_atlas/.


2019 ◽  
Vol 28 (17) ◽  
pp. 2976-2986 ◽  
Author(s):  
Irfahan Kassam ◽  
Yang Wu ◽  
Jian Yang ◽  
Peter M Visscher ◽  
Allan F McRae

Abstract: Despite extensive sex differences in human complex traits and disease, the male and female genomes differ only in the sex chromosomes. This implies that most sex-differentiated traits are the result of differences in the expression of genes that are common to both sexes. While sex differences in gene expression have been observed in a range of different tissues, the biological mechanisms for tissue-specific sex differences (TSSDs) in gene expression are not well understood. A total of 30 640 autosomal and 1021 X-linked transcripts were tested for heterogeneity in sex difference effect sizes in n = 617 individuals across 40 tissue types in Genotype–Tissue Expression (GTEx). This identified 65 autosomal and 66 X-linked TSSD transcripts (corresponding to unique genes) at a stringent significance threshold. Results for X-linked TSSD transcripts showed mainly concordant direction of sex differences across tissues and replicate previous findings. Autosomal TSSD transcripts had mainly discordant direction of sex differences across tissues. The top cis-expression quantitative trait loci (eQTLs) across tissues for autosomal TSSD transcripts are located a similar distance away from the nearest androgen and estrogen binding motifs and the nearest enhancer, as compared to cis-eQTLs for transcripts with stable sex differences in gene expression across tissue types. Enhancer regions that overlap top cis-eQTLs for TSSD transcripts, however, were found to be more dispersed across tissues. These observations suggest that androgen and estrogen regulatory elements in a cis region may play a common role in sex differences in gene expression, but TSSD in gene expression may additionally be due to causal variants located in tissue-specific enhancer regions.


2019 ◽  
Author(s):  
Anna Mikhaylova ◽  
Timothy Thornton

Abstract: Predicting gene expression with genetic data has garnered significant attention in recent years. PrediXcan is one of the most widely used gene-based association methods for testing imputed gene expression values with a phenotype due to the invaluable insight the method has shown into the relationship between complex traits and the component of gene expression that can be attributed to genetic variation. The prediction models for PrediXcan, however, were obtained using supervised machine learning methods and training data from the Depression and Gene Network (DGN) and the Genotype-Tissue Expression (GTEx) data, where the majority of subjects are of European descent. Many genetic studies, however, include samples from multi-ethnic populations, and in this paper we assess the accuracy of gene expression predictions with PrediXcan in diverse populations. Using transcriptomic data from the GEUVADIS (Genetic European Variation in Health and Disease) RNA sequencing project and whole genome sequencing data from the 1000 Genomes project, we evaluate and compare the predictive performance of PrediXcan in an African population (Yoruban) and four European populations. Prediction results are obtained using a range of models from PrediXcan weight databases, and Pearson’s correlation coefficient is used to measure prediction accuracy. We demonstrate that the predictive performance of PrediXcan varies across populations (F-test p-value < 0.001), with prediction accuracy lowest in the Yoruban sample compared to the European samples. Moreover, the performance of PrediXcan varies not only among distant populations, but also among closely related populations. We also find that the qualitative performance of PrediXcan for the populations considered is consistent across all weight databases used.
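The accuracy measure described in the abstract above, Pearson's correlation between predicted and observed expression, can be sketched as follows. This is a minimal illustration in plain NumPy and does not use the PrediXcan software or its weight databases; the helper names, the weights, and the simulated dosages are all hypothetical.

```python
import numpy as np

def predict_expression(dosages, weights):
    """Weighted sum of SNP dosages (individuals x SNPs) for one gene."""
    return np.asarray(dosages, dtype=float) @ np.asarray(weights, dtype=float)

def pearson_r(predicted, observed):
    """Pearson correlation between predicted and observed expression."""
    return float(np.corrcoef(predicted, observed)[0, 1])

# Simulated example (hypothetical numbers): 200 individuals, 5 SNPs coded 0/1/2.
rng = np.random.default_rng(1)
dosages = rng.integers(0, 3, size=(200, 5))
weights = np.array([0.4, -0.3, 0.0, 0.2, 0.1])
predicted = predict_expression(dosages, weights)
observed = predicted + rng.normal(scale=0.5, size=200)  # add non-genetic noise
accuracy = pearson_r(predicted, observed)
```

Computing `accuracy` per gene and per population would give the kind of table over which a comparison across populations could then be made.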


2017 ◽  
Author(s):  
Luke J. O’Connor ◽  
Alexander Gusev ◽  
Xuanyao Liu ◽  
Po-Ru Loh ◽  
Hilary K. Finucane ◽  
...  

Abstract: Disease risk variants identified by GWAS are predominantly noncoding, suggesting that gene regulation plays an important role. eQTL studies in unaffected individuals are often used to link disease-associated variants with the genes they regulate, relying on the hypothesis that noncoding regulatory effects are mediated by steady-state expression levels. To test this hypothesis, we developed a method to estimate the proportion of disease heritability mediated by the cis-genetic component of assayed gene expression levels. The method, gene expression co-score regression (GECS regression), relies on the idea that, for a gene whose expression level affects a phenotype, SNPs with similar effects on the expression of that gene will have similar phenotypic effects. In order to distinguish directional effects mediated by gene expression from non-directional pleiotropic or tagging effects, GECS regression operates on pairs of cis SNPs in linkage equilibrium, regressing pairwise products of disease effect sizes on products of cis-eQTL effect sizes. We verified that GECS regression produces robust estimates of mediated effects in simulations. We applied the method to eQTL data in 44 tissues from the GTEx consortium (average N_eQTL = 158 samples) in conjunction with GWAS summary statistics for 30 diseases and complex traits (average N_GWAS = 88K) with low pairwise genetic correlation, estimating the proportion of SNP-heritability mediated by the cis-genetic component of assayed gene expression in the union of the 44 tissues. The mean estimate was 0.21 (s.e. = 0.01) across 30 traits, with a significantly positive estimate (p < 0.001) for every trait. Thus, assayed gene expression in bulk tissues mediates a statistically significant but modest proportion of disease heritability, motivating the development of additional assays to capture regulatory effects and the use of our method to estimate how much disease heritability they mediate.
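The core regression idea in the abstract above, regressing pairwise products of disease effect sizes on pairwise products of cis-eQTL effect sizes, can be illustrated with a toy simulation. This is a sketch under strong assumptions and not the GECS regression method itself: it uses a single gene, treats all SNPs as independent (linkage equilibrium), uses true rather than estimated effect sizes, and the function name `pairwise_product_slope` and the parameter `gamma` are hypothetical.

```python
import numpy as np

def pairwise_product_slope(eqtl_effects, disease_effects):
    """Least-squares slope of disease-effect products on eQTL-effect products.

    Uses every unordered SNP pair (i < j); assumes the SNPs are independent.
    """
    b = np.asarray(eqtl_effects, dtype=float)
    a = np.asarray(disease_effects, dtype=float)
    i, j = np.triu_indices(b.size, k=1)   # all pairs with i < j
    x = b[i] * b[j]                        # products of cis-eQTL effects
    y = a[i] * a[j]                        # products of disease effects
    xc = x - x.mean()
    return float(np.dot(xc, y - y.mean()) / np.dot(xc, xc))

# Toy model (hypothetical): disease effect = gamma * eQTL effect + independent noise.
rng = np.random.default_rng(0)
n_snps = 1000
gamma = 0.5
eqtl = rng.normal(size=n_snps)
disease = gamma * eqtl + rng.normal(scale=0.2, size=n_snps)
slope = pairwise_product_slope(eqtl, disease)
```

Because the noise terms of two different SNPs are independent and average to zero in the product, the slope in this toy model approaches `gamma ** 2`. That illustrates why products over SNP pairs isolate the expression-mediated signal from effects that are not shared between the two SNPs.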


2021 ◽  
Author(s):  
Roshni A. Patel ◽  
Shaila A. Musharoff ◽  
Jeffrey P. Spence ◽  
Harold Pimentel ◽  
Catherine Tcheandjieu ◽  
...  

Despite the growing number of genome-wide association studies (GWAS) for complex traits, it remains unclear whether effect sizes of causal genetic variants differ between populations. In principle, effect sizes of causal variants could differ between populations due to gene-by-gene or gene-by-environment interactions. However, comparing causal variant effect sizes is challenging: it is difficult to know which variants are causal, and comparisons of variant effect sizes are confounded by differences in linkage disequilibrium (LD) structure between ancestries. Here, we develop a method to assess causal variant effect size differences that overcomes these limitations. Specifically, we leverage the fact that segments of European ancestry shared between European-American and admixed African-American individuals have similar LD structure, allowing for unbiased comparisons of variant effect sizes in European ancestry segments. We apply our method to two types of traits: gene expression and low-density lipoprotein cholesterol (LDL-C). We find that causal variant effect sizes for gene expression are significantly different between European-Americans and African-Americans; for LDL-C, we observe a similar point estimate although this is not significant, likely due to lower statistical power. Cross-population differences in variant effect sizes highlight the role of genetic interactions in trait architecture and will contribute to the poor portability of polygenic scores across populations, reinforcing the importance of conducting GWAS on individuals of diverse ancestries and environments.


2016 ◽  
Author(s):  
François Aguet ◽  
Andrew A. Brown ◽  
Stephane E. Castel ◽  
Joe R. Davis ◽  
Pejman Mohammadi ◽  
...  

Abstract: Expression quantitative trait locus (eQTL) mapping provides a powerful means to identify functional variants influencing gene expression and disease pathogenesis. We report the identification of cis-eQTLs from 7,051 post-mortem samples representing 44 tissues and 449 individuals as part of the Genotype-Tissue Expression (GTEx) project. We find a cis-eQTL for 88% of all annotated protein-coding genes, with one-third having multiple independent effects. We identify numerous tissue-specific cis-eQTLs, highlighting the unique functional impact of regulatory variation in diverse tissues. By integrating large-scale functional genomics data and state-of-the-art fine-mapping algorithms, we identify multiple features predictive of tissue-specific and shared regulatory effects. We improve estimates of cis-eQTL sharing and effect sizes using allele specific expression across tissues. Finally, we demonstrate the utility of this large compendium of cis-eQTLs for understanding the tissue-specific etiology of complex traits, including coronary artery disease. The GTEx project provides an exceptional resource that has improved our understanding of gene regulation across tissues and the role of regulatory variation in human genetic diseases.


2016 ◽  
Author(s):  
Alfonso Buil ◽  
Ana Viñuela ◽  
Andrew A. Brown ◽  
Matthew N. Davies ◽  
Ismael Padioleau ◽  
...  

Abstract: Gene expression can provide biological mechanisms which underlie genetic associations with complex traits and diseases, but often the most relevant tissue for the trait is inaccessible and a proxy is the only alternative. Here, we investigate shared and tissue specific patterns of variability in expression in multiple tissues, to quantify the degree of sharing of causes (genetic or non-genetic) of variability in gene expression among tissues. Using gene expression in ~800 female twins from the TwinsUK cohort in skin, fat, whole blood and lymphoblastoid cell lines (LCLs), we identified 9166 significant cis-eQTLs in fat, 9551 in LCLs, 8731 in skin and 5313 in blood (1% FDR). We observed that up to 80% of cis-eQTLs are shared in pairs of tissues. In addition, the cis genetic correlation between tissues is > 90% for 35% of the genes, indicating for these genes a largely tissue-shared component of cis regulation. However, variance components show that cis genetic signals explain only a small fraction of the variation in expression, with 67–87% of the variance explained by environmental factors, and 53% of the genetic effects occurring in trans. We observe a trans genetic correlation of 0 for all genes except a few which show correlation between fat and skin expression. The environmental effects are also observed to be entirely tissue specific, despite related tissues largely sharing exposures. These results demonstrate that patterns of gene expression are largely tissue specific, strongly supporting the need to study higher order regulatory interactions in the appropriate tissue context with large sample sizes and diversity of environmental contexts.


2018 ◽  
Author(s):  
Timothy J. Cherry ◽  
Marty G. Yang ◽  
David A. Harmin ◽  
Peter Tao ◽  
Andrew E. Timms ◽  
...  

Abstract: Cis-regulatory elements (CREs) orchestrate the dynamic and diverse transcriptional programs that assemble the human central nervous system (CNS) during development and maintain its function throughout life. Genetic variation within CREs plays a central role in phenotypic variation in complex traits including the risk of developing disease. However, the cellular complexity of the human brain has largely precluded the identification of functional regulatory variation within the human CNS. We took advantage of the retina, a well-characterized region of the CNS with reduced cellular heterogeneity, to establish a roadmap for characterizing regulatory variation in the human CNS. This comprehensive resource of tissue-specific regulatory elements, transcription factor binding, and gene expression programs in three regions of the human visual system (retina, macula, retinal pigment epithelium/choroid) reveals features of regulatory element evolution that shape tissue-specific gene expression programs and defines the regulatory elements with the potential to contribute to Mendelian and complex disorders of human vision.


2020 ◽  
Vol 11 (1) ◽  
Author(s):  
Tom G. Richardson ◽  
Gibran Hemani ◽  
Tom R. Gaunt ◽  
Caroline L. Relton ◽  
George Davey Smith

Abstract: Developing insight into tissue-specific transcriptional mechanisms can help improve our understanding of how genetic variants exert their effects on complex traits and disease. In this study, we apply the principles of Mendelian randomization to systematically evaluate transcriptome-wide associations between gene expression (across 48 different tissue types) and 395 complex traits. Our findings indicate that variants which influence gene expression levels in multiple tissues are more likely to influence multiple complex traits. Moreover, detailed investigations of our results highlight tissue-specific associations, drug validation opportunities, insight into the likely causal pathways for trait-associated variants and also implicate putative associations at loci yet to be implicated in disease susceptibility. Similar evaluations can be conducted at http://mrcieu.mrsoftware.org/Tissue_MR_atlas/.


2017 ◽  
Author(s):  
Farhad Hormozdiari ◽  
Steven Gazal ◽  
Bryce van de Geijn ◽  
Hilary Finucane ◽  
Chelsea J.-T. Ju ◽  
...  

Abstract: There is increasing evidence that many GWAS risk loci are molecular QTL for gene expression (eQTL), histone modification (hQTL), splicing (sQTL), and/or DNA methylation (meQTL). Here, we introduce a new set of functional annotations based on causal posterior probabilities (CPP) of fine-mapped molecular cis-QTL, using data from the GTEx and BLUEPRINT consortia. We show that these annotations are very strongly enriched for disease heritability across 41 independent diseases and complex traits (average N = 320K): 5.84x for GTEx eQTL, and 5.44x for eQTL, 4.27-4.28x for hQTL (H3K27ac and H3K4me1), 3.61x for sQTL and 2.81x for meQTL in BLUEPRINT (all P ≤ 1.39e-10), far higher than enrichments obtained using standard functional annotations that include all significant molecular cis-QTL (1.17-1.80x). eQTL annotations that were obtained by meta-analyzing all 44 GTEx tissues generally performed best, but tissue-specific blood eQTL annotations produced stronger enrichments for autoimmune diseases and blood cell traits and tissue-specific brain eQTL annotations produced stronger enrichments for brain-related diseases and traits, despite high cis-genetic correlations of eQTL effect sizes across tissues. Notably, eQTL annotations restricted to loss-of-function intolerant genes from ExAC were even more strongly enriched for disease heritability (17.09x; vs. 5.84x for all genes; P = 4.90e-17 for difference). All molecular QTL except sQTL remained significantly enriched for disease heritability in a joint analysis conditioned on each other and on a broad set of functional annotations from previous studies, implying that each of these annotations is uniquely informative for disease and complex trait architectures.

