Computational Assessment of the Regulation-Modulating Potential for Noncoding Variants

Mapping Intimacies ◽

10.1101/819409 ◽

2019 ◽

Author(s):

Fang-Yuan Shi ◽

Yu Wang ◽

Dong Huang ◽

Yu Liang ◽

Nan Liang ◽

...

Keyword(s):

Gene Expression ◽

Large Scale ◽

Genetic Diseases ◽

Superior Performance ◽

Massive Datasets ◽

Functional Variants ◽

False Discovery ◽

Genome Wide ◽

Causal Variants ◽

Higher Sensitivity

Abstract:

Large-scale genome-wide association and expression quantitative trait loci studies have identified multiple noncoding variants associated with genetic diseases via affecting gene expression. However, effectively and efficiently pinpointing causal variants remains a serious challenge. Here, we developed CARMEN, a novel algorithm to identify functional noncoding expression-modulating variants. Multiple evaluations demonstrated CARMEN’s superior performance over state-of-the-art tools. Its higher sensitivity and low false discovery rate enable CARMEN to identify multiple causal expression-modulating variants that other tools missed. Meanwhile, benefitting from the extensive annotations it generates, CARMEN provides mechanistic hints on predicted expression-modulating variants, enabling effective characterization of functional variants involved in gene expression and disease-related phenotypes. CARMEN scales well with massive datasets and is available online as a Web server at http://carmen.gao-lab.org.

Local genetic effects on gene expression across 44 human tissues

10.1101/074450 ◽

2016 ◽

Cited By ~ 21

Author(s):

François Aguet ◽

Andrew A. Brown ◽

Stephane E. Castel ◽

Joe R. Davis ◽

Pejman Mohammadi ◽

...

Keyword(s):

Gene Expression ◽

Complex Traits ◽

Large Scale ◽

Genetic Diseases ◽

Tissue Expression ◽

Specific Expression ◽

Tissue Specific ◽

Regulatory Variation ◽

Functional Variants ◽

Trait Locus

Abstract:

Expression quantitative trait locus (eQTL) mapping provides a powerful means to identify functional variants influencing gene expression and disease pathogenesis. We report the identification of cis-eQTLs from 7,051 post-mortem samples representing 44 tissues and 449 individuals as part of the Genotype-Tissue Expression (GTEx) project. We find a cis-eQTL for 88% of all annotated protein-coding genes, with one-third having multiple independent effects. We identify numerous tissue-specific cis-eQTLs, highlighting the unique functional impact of regulatory variation in diverse tissues. By integrating large-scale functional genomics data and state-of-the-art fine-mapping algorithms, we identify multiple features predictive of tissue-specific and shared regulatory effects. We improve estimates of cis-eQTL sharing and effect sizes using allele specific expression across tissues. Finally, we demonstrate the utility of this large compendium of cis-eQTLs for understanding the tissue-specific etiology of complex traits, including coronary artery disease. The GTEx project provides an exceptional resource that has improved our understanding of gene regulation across tissues and the role of regulatory variation in human genetic diseases.

Genome-wide fine-mapping identifies pleiotropic and functional variants that predict many traits across global cattle populations

Nature Communications ◽

10.1038/s41467-021-21001-0 ◽

2021 ◽

Vol 12 (1) ◽

Cited By ~ 1

Author(s):

Ruidong Xiang ◽

Iona M. MacLeod ◽

Hans D. Daetwyler ◽

Gerben de Jong ◽

Erin O’Connor ◽

...

Keyword(s):

Dairy Cattle ◽

Chromosome Segment ◽

Multiple Traits ◽

Functional Variants ◽

Genome Wide ◽

The USA ◽

Pleiotropic QTL ◽

Causal Variants ◽

Genetic Value ◽

Bayesian Mixture Models

Abstract:

The difficulty in finding causative mutations has hampered their use in genomic prediction. Here, we present a methodology to fine-map potentially causal variants genome-wide by integrating the functional, evolutionary and pleiotropic information of variants using GWAS, variant clustering and Bayesian mixture models. Our analysis of 17 million sequence variants in 44,000+ Australian dairy cattle for 34 traits suggests, on average, one pleiotropic QTL existing in each 50 kb chromosome-segment. We selected a set of 80k variants representing potentially causal variants within each chromosome segment to develop a bovine XT-50K genotyping array. The custom array contains many pleiotropic variants with biological functions, including splicing QTLs and variants at conserved sites across 100 vertebrate species. This biology-informed custom array outperformed the standard array in predicting genetic value of multiple traits across populations in independent datasets of 90,000+ dairy cattle from the USA, Australia and New Zealand.

HCR-FlowFISH: A flexible CRISPR screening method to identify cis-regulatory elements and their target genes

10.1101/2020.05.11.078675 ◽

2020 ◽

Author(s):

SK Reilly ◽

SJ Gosai ◽

A Gutierrez ◽

JC Ulirsch ◽

M Kanai ◽

...

Keyword(s):

Gene Expression ◽

Target Genes ◽

Screening Method ◽

Cell Types ◽

Regulatory Elements ◽

Hybridization Chain Reaction ◽

Genome Wide ◽

Wide Range ◽

Causal Variants ◽

Endogenous Loci

Abstract:

CRISPR screens for cis-regulatory elements (CREs) have shown unprecedented power to endogenously characterize the non-coding genome. To characterize CREs we developed HCR-FlowFISH (Hybridization Chain Reaction Fluorescent In-Situ Hybridization coupled with Flow Cytometry), which directly quantifies native transcripts within their endogenous loci following CRISPR perturbations of regulatory elements, eliminating the need for restrictive phenotypic assays such as growth or transcript-tagging. HCR-FlowFISH accurately quantifies gene expression across a wide range of transcript levels and cell types. We also developed CASA (CRISPR Activity Screen Analysis), a hierarchical Bayesian model to identify and quantify CRE activity. Using >270,000 perturbations, we identified CREs for GATA1, HDAC6, ERP29, LMO2, MEF2C, CD164, NMU, FEN1 and the FADS gene cluster. Our methods detect subtle gene expression changes and identify CREs regulating multiple genes, sometimes at different magnitudes and directions. We demonstrate the power of HCR-FlowFISH to parse genome-wide association signals by nominating causal variants and target genes.

Meta-analysis of 208370 East Asians identifies 113 susceptibility loci for systemic lupus erythematosus

Annals of the Rheumatic Diseases ◽

10.1136/annrheumdis-2020-219209 ◽

2020 ◽

pp. annrheumdis-2020-219209

Author(s):

Xianyong Yin ◽

Kwangwoo Kim ◽

Hiroyuki Suetsugu ◽

So-Young Bang ◽

Leilei Wen ◽

...

Keyword(s):

Systemic Lupus Erythematosus ◽

Lupus Erythematosus ◽

Large Scale ◽

Meta Analysis ◽

Genetic Correlations ◽

Susceptibility Loci ◽

Systemic Lupus ◽

East Asians ◽

Genome Wide ◽

Causal Variants

Objective:

Systemic lupus erythematosus (SLE), an autoimmune disorder, has been associated with nearly 100 susceptibility loci. Nevertheless, these loci only partially explain SLE heritability and their putative causal variants are rarely prioritised, which makes it challenging to elucidate disease biology. To detect new SLE loci and causal variants, we performed the largest genome-wide meta-analysis for SLE in East Asian populations.

Methods:

We newly genotyped 10 029 SLE cases and 180 167 controls and subsequently meta-analysed them jointly with 3348 SLE cases and 14 826 controls from published studies in East Asians. We further applied a Bayesian statistical approach to localise the putative causal variants for SLE associations.

Results:

We identified 113 genetic regions including 46 novel loci at genome-wide significance (p<5×10⁻⁸). Conditional analysis detected 233 association signals within these loci, which suggests widespread allelic heterogeneity. We detected genome-wide associations at six new missense variants. Bayesian statistical fine-mapping analysis prioritised the putative causal variants to a small set of variants (95% credible set size ≤10) for 28 association signals. We identified 110 putative causal variants with posterior probabilities ≥0.1 for 57 SLE loci, among which we prioritised 10 most likely putative causal variants (posterior probability ≥0.8). Linkage disequilibrium score regression detected genetic correlations for SLE with albumin/globulin ratio (rg=−0.242) and non-albumin protein (rg=0.238).

Conclusion:

This study reiterates the power of large-scale genome-wide meta-analysis for novel genetic discovery. These findings shed light on genetic and biological understandings of SLE.
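
The 95% credible set mentioned in the abstract above can be illustrated with a minimal Python sketch. It is not the study's pipeline or data: the function name `credible_set`, the variant IDs and the posterior probabilities are all hypothetical. The sketch assumes one causal variant per association signal and that a posterior probability is already available for each variant.

```python
def credible_set(posteriors, coverage=0.95):
    """Return the smallest list of variant IDs whose normalised
    posterior probabilities sum to at least `coverage`.

    `posteriors` maps a variant ID to its posterior probability of
    being causal (hypothetical input, one signal per call).
    """
    total = sum(posteriors.values())
    if total <= 0:
        raise ValueError("posterior probabilities must sum to a positive value")
    # Rank variants from most to least probable.
    ranked = sorted(posteriors.items(), key=lambda kv: kv[1], reverse=True)
    chosen, cumulative = [], 0.0
    for variant, prob in ranked:
        chosen.append(variant)
        cumulative += prob / total  # normalise in case inputs do not sum to 1
        if cumulative >= coverage:
            break
    return chosen


# Hypothetical posterior probabilities for one association signal.
pips = {"var_a": 0.62, "var_b": 0.25, "var_c": 0.09, "var_d": 0.03, "var_e": 0.01}
cs = credible_set(pips)
print(cs, len(cs))
```

A smaller credible set means the signal is better resolved. The abstract's "credible set size ≤10" criterion corresponds to checking `len(cs) <= 10` for each signal.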

Genetics of juvenile rheumatic diseases

10.1093/med/9780199642489.003.0043_update_002 ◽

2015 ◽

Author(s):

Anne Hinks ◽

Wendy Thomson

Keyword(s):

Risk Factors ◽

Rheumatic Diseases ◽

Large Scale ◽

Association Studies ◽

Genetic Diseases ◽

Response To Treatment ◽

Genome Wide Association Studies ◽

Established Risk Factor ◽

Genome Wide ◽

Juvenile Rheumatic Diseases

Juvenile rheumatic diseases are heterogeneous, complex genetic diseases; to date only juvenile idiopathic arthritis (JIA) has been extensively studied in terms of identifying genetic risk factors. The MHC region is a well-established risk factor but in the last few years candidate gene and large-scale genome-wide association studies have been utilized in the search for non-HLA risk factors. There are now 17 JIA susceptibility loci which reach the genome-wide significance threshold for association and a further 7 regions with evidence for association in more than one study. In addition, some subtype-specific associations are emerging. These risk loci now need to be investigated further using fine-mapping strategies and then appropriate functional studies to show how the variant alters the gene function. This knowledge will not only lead to a better understanding of disease pathogenesis for juvenile rheumatic diseases but may also aid in the classification of these heterogeneous diseases. It may identify new pathways for potential therapeutic targets and help in the prediction of disease outcome and response to treatment.

Chromatin compartment dynamics in a haploinsufficient model of cardiac laminopathy

10.1101/555250 ◽

2019 ◽

Cited By ~ 1

Author(s):

Alessandro Bertero ◽

Paul A. Fields ◽

Alec S. T. Smith ◽

Andrea Leonard ◽

Kevin Beussman ◽

...

Keyword(s):

Gene Expression ◽

Large Scale ◽

Induced Pluripotent Stem Cell ◽

Lamin A ◽

Chromosome Conformation ◽

Human Cardiomyocytes ◽

Nuclear Lamins ◽

Genome Wide ◽

Induced Pluripotent ◽

Gene Expression Alterations

Abstract:

Pathogenic mutations in A-type nuclear lamins cause dilated cardiomyopathy, which is postulated to result from dysregulated gene expression due to changes in chromatin organization into active and inactive compartments. To test this, we performed genome-wide chromosome conformation analyses (Hi-C) in human induced pluripotent stem cell-derived cardiomyocytes (hiPSC-CMs) with a haploinsufficient mutation for lamin A/C. Compared to gene-corrected cells, mutant hiPSC-CMs have marked electrophysiological and contractile alterations, with modest gene expression changes. While large-scale changes in chromosomal topology are evident, differences in chromatin compartmentalization are limited to a few hotspots that escape inactivation during cardiogenesis. These regions exhibit upregulation of multiple non-cardiac genes including CACNA1A, encoding for neuronal P/Q-type calcium channels. Pharmacological inhibition of the resulting current partially mitigates the electrical alterations. On the other hand, A/B compartment changes do not explain most gene expression alterations in mutant hiPSC-CMs. We conclude that global errors in chromosomal compartmentation are not the primary pathogenic mechanism in heart failure due to lamin A/C haploinsufficiency.

Summary:

Bertero et al. observe that lamin A/C haploinsufficiency in human cardiomyocytes markedly alters electrophysiology, contractility, gene expression, and chromosomal topology. Contrary to expectations, however, changes in chromatin compartments involve just a few regions, and most dysregulated genes lie outside these hotspots.

Condensed title:

Genomic effects of lamin A/C haploinsufficiency

Gene regulatory effects of a large chromosomal inversion in highland maize

10.1101/861583 ◽

2019 ◽

Cited By ~ 1

Author(s):

Taylor Crow ◽

James Ta ◽

Saghi Nojoomi ◽

M. Rocío Aguilar-Rangel ◽

Jorge Vladimir Torres Rodríguez ◽

...

Keyword(s):

Gene Expression ◽

Zea Mays ◽

Large Scale ◽

Chromosomal Inversion ◽

Experimental Conditions ◽

Chromosomal Inversions ◽

Functional Variants ◽

Original Manuscript ◽

Locally Adaptive ◽

Highland Maize

Abstract:

Chromosomal inversions play an important role in local adaptation. Inversions can capture multiple locally adaptive functional variants in a linked block by repressing recombination. However, this recombination suppression makes it difficult to identify the genetic mechanisms that underlie an inversion’s role in adaptation. In this study, we explore how large-scale transcriptomic data can be used to dissect the functional importance of a 13 Mb inversion locus (Inv4m) found almost exclusively in highland populations of maize (Zea mays ssp. mays). Inv4m introgressed into highland maize from the wild relative Zea mays ssp. mexicana, also present in the highlands of Mexico, and is thought to be important for the adaptation of these populations to cultivation in highland environments. First, using a large publicly available association mapping panel, we confirmed that Inv4m is associated with locally adaptive agronomic phenotypes, but only in highland fields. Second, we created two families segregating for standard and inverted haplotypes of Inv4m in an isogenic B73 background, and measured gene expression variation associated with Inv4m across 9 tissues in two experimental conditions. With these data, we quantified both the global transcriptomic effects of the highland Inv4m haplotype, and the local cis-regulatory variation present within the locus. We found diverse physiological effects of Inv4m, and speculate that the genetic basis of its effects on adaptive traits is distributed across many separate functional variants.

Author Summary:

Chromosomal inversions are an important type of genomic structural variant. However, mapping causal alleles within their boundaries is difficult because inversions suppress recombination between homologous chromosomes. This means that inversions, regardless of their size, are inherited as a unit. We leveraged the high-dimensional phenotype of gene expression as a tool to study the genetics of a large chromosomal inversion found in highland maize populations in Mexico, Inv4m. We grew plants carrying multiple versions of Inv4m in a common genetic background, and quantified the transcriptional reprogramming induced by alternative alleles at the locus. Inv4m has been shown in previous studies to have a large effect on flowering, but we show that the functional variation within Inv4m affects many developmental and physiological processes.

Author Contributions:

T. Crow, R. Rellan-Alvarez, R. Sawers and D. Runcie conceived and designed the experiment. M. Aguilar-Rangel, J. Rodríguez, R. Rellan-Alvarez and R. Sawers generated the segregating families. T. Crow, J. Ta, S. Nojoomi, M. Aguilar-Rangel, J. Rodríguez, D. Gates and D. Runcie performed the experiment. T. Crow, D. Gates and D. Runcie analyzed the data. T. Crow and D. Runcie wrote the original manuscript, and R. Rellan-Alvarez and R. Sawers provided review and editing.

Genome-wide profiling of transcribed enhancers during macrophage activation

10.1101/163519 ◽

2017 ◽

Author(s):

Elena Denisenko ◽

Reto Guler ◽

Musa Mhlanga ◽

Harukazu Suzuki ◽

Frank Brombacher ◽

...

Keyword(s):

Gene Expression ◽

Transcriptional Activation ◽

Large Scale ◽

Transcriptional Control ◽

Macrophage Activation ◽

Transcriptional Responses ◽

Protein Coding ◽

Genome Wide ◽

IFN-γ ◽

Cap Analysis

Abstract:

Macrophages are sentinel cells essential for tissue homeostasis and host defence. Owing to their plasticity, macrophages acquire a range of functional phenotypes in response to microenvironmental stimuli, of which M(IFN-γ) and M(IL-4/IL-13) are well-known for their opposing pro- and anti-inflammatory roles. Enhancers have emerged as regulatory DNA elements crucial for transcriptional activation of gene expression. Using cap analysis of gene expression and epigenetic data, we identify transcribed enhancers in mouse macrophages on a large scale, along with their time kinetics and target protein-coding genes. We observe an increase in target gene expression, concomitant with increasing numbers of associated enhancers, and find that genes associated with many enhancers show a shift towards stronger enrichment for macrophage-specific biological processes. We infer enhancers that drive transcriptional responses of genes upon M(IFN-γ) and M(IL-4/IL-13) macrophage activation and demonstrate stimuli-specificity of regulatory associations. Finally, we show that enhancer regions are enriched for binding sites of inflammation-related transcription factors, suggesting a link between stimuli response and enhancer transcriptional control. Our study provides new insights into genome-wide enhancer-mediated transcriptional control of macrophage genes, including those implicated in macrophage activation, and offers a detailed genome-wide catalogue to further elucidate enhancer regulation in macrophages.

The contribution of common regulatory and protein-coding TYR variants in the genetic architecture of albinism

10.1101/2021.11.01.21265733 ◽

2021 ◽

Author(s):

Vincent Michaud ◽

Eulalie Lasseaux ◽

David J Green ◽

Dave T Gerrard ◽

Claudio Plaisant ◽

...

Keyword(s):

Genetic Architecture ◽

Large Scale ◽

Diagnostic Yield ◽

Genetic Diseases ◽

Gene Encoding ◽

Protein Coding ◽

Autosomal Recessive Disorders ◽

Functional Variants ◽

Prevalent Disease ◽

Coding Variants

Genetic diseases have been historically segregated into rare Mendelian and common complex conditions. Large-scale studies using genome sequencing are eroding this distinction and are gradually unmasking the underlying complexity of human traits. We studied a cohort of 1,313 individuals with albinism aiming to gain insights into the genetic architecture of rare, autosomal recessive disorders. We investigated the contribution of regulatory and protein-coding variants at the common and rare ends of the allele-frequency spectrum. We focused on TYR, the gene encoding tyrosinase, and found that a promoter variant, TYR: c.-301C>T [rs4547091], modulates the penetrance of a prevalent, disease-associated missense change, TYR: c.1205G>A [rs1126809]. We also found that homozygosity for a haplotype formed by three common, functional variants, TYR: c.[-301C;575C>A;1205G>A], confers a high risk of albinism (OR>77) and is associated with reduced vision in UK Biobank participants. Finally, we report how the combined analysis of rare and common variants increases diagnostic yield and informs genetic counselling in families with albinism.

Non-B DNA: a major contributor to small- and large-scale variation in nucleotide substitution frequencies across the genome

Nucleic Acids Research ◽

10.1093/nar/gkaa1269 ◽

2021 ◽

Vol 49 (3) ◽

pp. 1497-1516

Author(s):

Wilfried M Guiblet ◽

Marzia A Cremona ◽

Robert S Harris ◽

Di Chen ◽

Kristin A Eckert ◽

...

Keyword(s):

Large Scale ◽

Nucleotide Substitution ◽

Genetic Diseases ◽

Nucleotide Polymorphisms ◽

DNA Structures ◽

Cellular Processes ◽

Genome Wide ◽

DNA Types ◽

Flanking Regions

Abstract:

Approximately 13% of the human genome can fold into non-canonical (non-B) DNA structures (e.g. G-quadruplexes, Z-DNA, etc.), which have been implicated in vital cellular processes. Non-B DNA also hinders replication, increasing errors and facilitating mutagenesis, yet its contribution to genome-wide variation in mutation rates remains unexplored. Here, we conducted a comprehensive analysis of nucleotide substitution frequencies at non-B DNA loci within noncoding, non-repetitive genome regions, their ±2 kb flanking regions, and 1-Megabase windows, using human-orangutan divergence and human single-nucleotide polymorphisms. Functional data analysis at single-base resolution demonstrated that substitution frequencies are usually elevated at non-B DNA, with patterns specific to each non-B DNA type. Mirror, direct and inverted repeats have higher substitution frequencies in spacers than in repeat arms, whereas G-quadruplexes, particularly stable ones, have higher substitution frequencies in loops than in stems. Several non-B DNA types also affect substitution frequencies in their flanking regions. Finally, non-B DNA explains more variation than any other predictor in multiple regression models for diversity or divergence at 1-Megabase scale. Thus, non-B DNA substantially contributes to variation in substitution frequencies at small and large scales. Our results highlight the role of non-B DNA in germline mutagenesis with implications to evolution and genetic diseases.
