Functional disease architectures reveal unique biological role of transposable elements

2018 ◽  
Author(s):  
Farhad Hormozdiari ◽  
Bryce van de Geijn ◽  
Joseph Nasser ◽  
Omer Weissbrod ◽  
Steven Gazal ◽  
...  

Abstract: Transposable elements (TE) comprise roughly half of the human genome. Though initially derided as “junk DNA”, they have been widely hypothesized to contribute to the evolution of gene regulation. However, the contribution of TE to the genetic architecture of diseases and complex traits remains unknown. Here, we analyze data from 41 independent diseases and complex traits (average N=320K) to draw three main conclusions. First, TE are uniquely informative for disease heritability. Despite overall depletion for heritability (54% of SNPs, 39±2% of heritability; enrichment of 0.72±0.03; 0.38-1.23 enrichment across four main TE classes), TE explain substantially more heritability than expected based on their depletion for known functional annotations (expected enrichment of 0.35±0.03; 2.11x ratio of true vs. expected enrichment). This implies that TE acquire function in ways that differ from known functional annotations. Second, older TE contribute more to disease heritability, consistent with acquiring biological function; SNPs inside the oldest 20% of TE explain 2.45x more heritability than SNPs inside the youngest 20% of TE. Third, Short Interspersed Nuclear Elements (SINE; one of the four main TE classes) are far more enriched for blood traits (2.05±0.30) than for other traits (0.96±0.09); this difference is far greater than expected based on the weaker depletion of SINEs for regulatory annotations in blood compared to other tissues. Our results elucidate the biological roles that TE play in the genetic architecture of diseases and complex traits.
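The enrichment figures quoted in this abstract follow the usual definition of heritability enrichment: the share of heritability explained by an annotation divided by the share of SNPs it contains. The sketch below reproduces that arithmetic from the rounded percentages in the abstract. The function name `enrichment` is hypothetical and is not taken from the paper. With these rounded inputs the observed-to-expected ratio comes out near 2.06 rather than the reported 2.11x, presumably because the authors worked from unrounded estimates or a different estimation procedure (an assumption, not something the abstract states).

```python
def enrichment(h2_share: float, snp_share: float) -> float:
    """Heritability enrichment: share of heritability / share of SNPs.

    Hypothetical helper for illustration; not the authors' code.
    """
    if not 0 < snp_share <= 1:
        raise ValueError("snp_share must be in (0, 1]")
    return h2_share / snp_share


# Rounded figures quoted in the abstract: TE hold 54% of SNPs
# and explain 39% of heritability.
te_observed = enrichment(h2_share=0.39, snp_share=0.54)

# Expected enrichment given TE depletion for known annotations,
# as quoted in the abstract.
te_expected = 0.35

# Ratio of observed to expected enrichment, from rounded inputs only.
ratio = te_observed / te_expected

print(f"observed enrichment: {te_observed:.2f}")
print(f"observed / expected: {ratio:.2f}")
```

An enrichment below 1 means the annotation explains less heritability than its share of SNPs would suggest, which is what "depletion" refers to in the abstract.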

2021 ◽  
Author(s):  
Laura L Colbran ◽  
Maya R Johnson ◽  
Iain Mathieson ◽  
John A Capra

As humans spread throughout the world, they adapted to variation in many environmental factors, including climate, diet, and pathogens. Because many of these adaptations were likely mediated by multiple non-coding variants with small effects on gene regulation, it has been difficult to link genomic signals of selection to specific genes, and to describe the regulatory response to selection. To overcome this challenge, we adapted PrediXcan, a machine learning method for imputing gene regulation from genotype data, to analyze low-coverage ancient human DNA (aDNA). First, we used simulated genomes to benchmark strategies for adapting gene regulatory prediction to increase robustness to incomplete aDNA data. Applying the resulting models to 490 ancient Eurasians, we found that genes with the strongest divergent regulation among ancient populations with hunter-gatherer, pastoralist, and agricultural lifestyles are enriched for metabolic and immune functions. Next, we explored the contribution of divergent gene regulation to two traits with strong evidence of recent adaptation: dietary metabolism and skin pigmentation. We found enrichment for divergent regulation among genes previously proposed to be involved in diet-related local adaptation, and in many cases, the predicted effects on regulation provide explanations for previously observed signals of selection, e.g., at FADS1, GPX1, and LEPR. For skin pigmentation, we applied new models trained in melanocytes to a time series of 2999 ancient Europeans spanning ~38,000 years BP. In contrast to diet, skin pigmentation genes show little regulatory change over time, suggesting that adaptation mainly involved large-effect coding variants. 
This work demonstrates how aDNA can be combined with present-day genomes to shed light on the biological differences among ancient populations, the role of gene regulation in adaptation, and the relationship between ancient genetic diversity and the present-day distribution of complex traits.


2021 ◽  
Author(s):  
Wenmin Zhang ◽  
Hamed S Najafabadi ◽  
Yue Li

Identifying causal variants from genome-wide association studies (GWASs) is challenging due to widespread linkage disequilibrium (LD). Functional annotations of the genome may help prioritize variants that are biologically relevant and thus improve fine-mapping of GWAS results. However, classical fine-mapping methods have a high computational cost, particularly when the underlying genetic architecture and LD patterns are complex. Here, we propose a novel approach, SparsePro, to efficiently conduct functionally informed statistical fine-mapping. Our method enjoys two major innovations: First, by creating a sparse low-dimensional projection of the high-dimensional genotype, we enable a linear search of causal variants instead of an exponential search of causal configurations used in existing methods; Second, we adopt a probabilistic framework with a highly efficient variational expectation-maximization algorithm to integrate statistical associations and functional priors. We evaluate SparsePro through extensive simulations using resources from the UK Biobank. Compared to state-of-the-art methods, SparsePro achieved more accurate and well-calibrated posterior inference with greatly reduced computation time. We demonstrate the utility of SparsePro by investigating the genetic architecture of five functional biomarkers of vital organs. We identify potential causal variants contributing to the genetically encoded coordination mechanisms between vital organs and pinpoint target genes with potential pleiotropic effects. In summary, we have developed an efficient genome-wide fine-mapping method with the ability to integrate functional annotations. Our method may have wide utility in understanding the genetics of complex traits as well as in increasing the yield of functional follow-up studies of GWASs.
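To make the contrast between a "linear search of causal variants" and an "exponential search of causal configurations" concrete, the sketch below simply counts candidates under each scheme. It is a counting illustration only, not SparsePro's algorithm. The function names are hypothetical, and reading "linear search" as scanning each of up to K effects across p variants is an assumption about what the abstract means.

```python
from math import comb


def n_configurations(p: int, k_max: int) -> int:
    """Count the subsets of p variants that contain 1..k_max causal variants.

    This is the search space when every causal configuration is enumerated.
    """
    return sum(comb(p, k) for k in range(1, k_max + 1))


def n_linear_evaluations(p: int, k_max: int) -> int:
    """Count evaluations when each of k_max effects is scanned over p variants.

    Assumed reading of a per-effect linear search; grows as p * k_max.
    """
    return p * k_max


if __name__ == "__main__":
    # Illustrative locus sizes, chosen arbitrarily.
    for p in (100, 1000):
        for k_max in (1, 3, 5):
            print(
                f"p={p:5d} K={k_max}: "
                f"configurations={n_configurations(p, k_max):.3e}, "
                f"linear={n_linear_evaluations(p, k_max)}"
            )
```

Even at a modest locus size the number of configurations grows combinatorially with the number of allowed causal variants, while the per-effect scan grows only as the product of the two.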


2017 ◽  
Vol 96 (11) ◽  
pp. 1238-1247 ◽  
Author(s):  
F. Thieme ◽  
K.U. Ludwig

In the past decade, medical genetic research has generated multiple discoveries, many of which were obtained via genome-wide association studies (GWASs). A major GWAS finding is that the majority of risk variants for complex traits map to noncoding regions. This has resulted in a paradigm shift in terms of the interpretation of human genomic sequence variation, with more attention now being paid to what was previously termed “junk DNA.” Translation of genetic findings into biologically meaningful results requires 1) large-scale and cell-specific efforts to annotate non-protein–coding regions and 2) the integration of comprehensive genomic data sets. However, this represents an enormous challenge, particularly in the case of human traits that arise during embryonic development, such as orofacial clefts (OFCs). OFC is a multifactorial trait and ranks among the most common of all human congenital malformations. These 2 attributes apply in particular to its isolated forms (nonsyndromic OFC [nsOFC]). Although genetic studies (including GWASs) have yielded novel insights into the genetic architecture of nsOFC, few data are available concerning causality and affected biological pathways. Reasons for this deficiency include the complex genetic architecture at risk loci and the limited availability of functional data sets from human tissues that represent relevant embryonic sites and time points. The present review summarizes current knowledge of the role of noncoding regions in nsOFC etiology. We describe the identification of genetic risk factors for nsOFC and several of the approaches used to identify causal variants at these loci. These strategies include the use of biological and genetic information from public databases, the assessment of the full spectrum of genetic variability within 1 locus, and comprehensive in vitro and in vivo experiments. 
This review also highlights the role of the emerging research field “functional genomics” and its increasing contribution to our biological understanding of nsOFC.


Author(s):  
Arsala Ali ◽  
Kyudong Han ◽  
Ping Liang

Transposable elements (TEs), also known as mobile elements (MEs), are interspersed repeats that constitute a major fraction of the genomes of higher organisms. As one of their important functional impacts on gene function and genome evolution, TEs participate in regulating the expression of genes nearby and even far away at transcriptional and post-transcriptional levels. There are two principal ways by which TEs regulate expression of genes in the human genome. First, TEs provide cis-regulatory sequences in the genome. TEs’ intrinsic regulatory properties for their own expression make them potential factors for regulating the expression of the host genes. TE-derived cis-regulatory sites are found in promoter and enhancer elements, providing binding sites for a wide range of trans-acting factors. Second, TEs encode for regulatory RNAs. TEs sequences have been revealed to be present in a substantial fraction of miRNAs and long non-coding RNAs (lncRNAs), indicating their TE origin. Furthermore, TEs sequences were found to be critical for regulatory functions of these RNAs including binding to the target mRNA. TEs thus provide crucial regulatory roles by being part of cis-regulatory and regulatory RNA sequences. Moreover, both TE-derived cis-regulatory sequences and TE-derived regulatory RNAs, have been implicated to provide evolutionary novelty to gene regulation. These TE-derived regulatory mechanisms also tend to function in tissue-specific fashion. In this review, we aim to comprehensively cover the studies regarding these two aspects of TE-mediated gene regulation, mainly focusing on the mechanisms, contribution of different types of TEs, differential roles among tissue types, and lineage specificity, based on data mostly in humans.


PLoS Biology ◽  
2010 ◽  
Vol 8 (7) ◽  
pp. e1000414 ◽  
Author(s):  
Alexander M. Tsankov ◽  
Dawn Anne Thompson ◽  
Amanda Socha ◽  
Aviv Regev ◽  
Oliver J. Rando
