On entropy and information in gene interaction networks

Z S Wallace; S B Rosenthal; K M Fisch; T Ideker; R Sasik

doi:10.1093/bioinformatics/bty691

Bioinformatics, doi:10.1093/bioinformatics/bty691, 2018, Vol 35 (5), pp. 815-822, Cited By ~ 3

Author(s): Z S Wallace, S B Rosenthal, K M Fisch, T Ideker, R Sasik

Keyword(s): Degrees Of Freedom, Association Studies, Gene List, Interaction Network, Gene Interaction, Interaction Networks, Harmonic Oscillators, Genome Wide Association Studies, Gene Enrichment, Gene Sets

Abstract

Motivation: Modern biological experiments often produce candidate lists of genes presumably related to the studied phenotype. One can ask if the gene list as a whole makes sense in the context of existing knowledge: Are the genes in the list reasonably related to each other or do they look like a random assembly? There are also situations when one wants to know if two or more gene sets are closely related. Gene enrichment tests based on counting the number of genes two sets have in common are adequate if we presume that two genes are related only when they are in fact identical. If by related we mean well connected in the interaction network space, we need a new measure of relatedness for gene sets.

Results: We derive entropy, interaction information and mutual information for gene sets on interaction networks, starting from a simple phenomenological model of a living cell. Formally, the model describes a set of interacting linear harmonic oscillators in thermal equilibrium. Because the energy function is a quadratic form of the degrees of freedom, entropy and all other derived information quantities can be calculated exactly. We apply these concepts to estimate the probability that genes from several independent genome-wide association studies are not mutually informative; to estimate the probability that two disjoint canonical metabolic pathways are not mutually informative; and to infer relationships among human diseases based on their gene signatures. We show that the present approach is able to predict observationally validated relationships not detectable by gene enrichment methods. The converse is also true; the two methods are therefore complementary.

Availability and implementation: The functions defined in this paper are available in an R package, gsia, available for download at https://github.com/ucsd-ccbb/gsia.
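To illustrate why a quadratic energy function makes these quantities exactly computable, the minimal Python sketch below treats the equilibrium state as a zero-mean Gaussian whose covariance is the inverse of a coupling matrix K (temperature set to 1). The toy 4-gene path network, the choice of K as graph Laplacian plus identity, and all function names are assumptions made for illustration; this is not the gsia implementation, and the paper's model may differ in its details.

```python
import math

def cholesky(m):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(m)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = m[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][j] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return low

def logdet(m):
    """Log-determinant from the diagonal of the Cholesky factor."""
    return 2.0 * sum(math.log(row[i]) for i, row in enumerate(cholesky(m)))

def inverse(m):
    """Matrix inverse by Gauss-Jordan elimination with partial pivoting."""
    n = len(m)
    aug = [[float(v) for v in row] + [float(i == j) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def entropy(cov, genes):
    """Differential entropy of the Gaussian marginal over the listed genes."""
    sub = [[cov[i][j] for j in genes] for i in genes]
    return 0.5 * (len(genes) * math.log(2 * math.pi * math.e) + logdet(sub))

def mutual_information(cov, set_a, set_b):
    """I(A;B) = H(A) + H(B) - H(A,B); set_a and set_b must be disjoint lists."""
    return entropy(cov, set_a) + entropy(cov, set_b) - entropy(cov, set_a + set_b)

# Hypothetical path network 0-1-2-3: graph Laplacian plus identity (assumed form of K).
K = [[2, -1, 0, 0],
     [-1, 3, -1, 0],
     [0, -1, 3, -1],
     [0, 0, -1, 2]]
cov = inverse(K)  # Gaussian covariance for energy E(x) = x^T K x / 2 at unit temperature

mi_near = mutual_information(cov, [0], [1])  # directly linked genes
mi_far = mutual_information(cov, [0], [3])   # genes three steps apart
```

In this toy network the directly linked genes 0 and 1 share more information than the end genes 0 and 3, even though the single-gene sets have no members in common, which is the kind of relatedness an overlap count cannot register.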

Comorbidities and Susceptibility to COVID-19: A Generalized Gene Set Data Mining Approach

Journal of Clinical Medicine, doi:10.3390/jcm10081666, 2021, Vol 10 (8), pp. 1666

Author(s): Micaela F. Beckman, Farah Bahrani Mougeot, Jean-Luc C. Mougeot

Keyword(s): Protein Interactions, Association Studies, Meta Analysis, Holistic Approach, Gene Interaction, Snp Analysis, Genome Wide Association Studies, Nucleotide Polymorphisms, Protein Protein Interactions, Data Mining Approach

The COVID-19 pandemic has led to over 2.26 million deaths for almost 104 million confirmed cases worldwide, as of 4 February 2021 (WHO). Risk factors include pre-existing conditions such as cancer, cardiovascular disease, diabetes, and obesity. Although several vaccines have been deployed, there are few alternative anti-viral treatments available in the case of reduced or non-existent vaccine protection. Adopting a long-term holistic approach to cope with the COVID-19 pandemic appears critical with the emergence of novel and more infectious SARS-CoV-2 variants. Our objective was to identify comorbidity-associated single nucleotide polymorphisms (SNPs), potentially conferring increased susceptibility to SARS-CoV-2 infection using a computational meta-analysis approach. SNP datasets were downloaded from a publicly available genome-wide association studies (GWAS) catalog for 141 of 258 candidate COVID-19 comorbidities. Gene-level SNP analysis was performed to identify significant pathways by using the program MAGMA. An SNP annotation program was used to analyze MAGMA-identified genes. Differential gene expression was determined for significant genes across 30 general tissue types using the Functional and Annotation Mapping of GWAS online tool GENE2FUNC. COVID-19 comorbidities (n = 22) from six disease categories were found to have significant associated pathways, validated by Q–Q plots (p < 0.05). Protein–protein interactions of significant (p < 0.05) differentially expressed genes were visualized with the STRING program. Gene interaction networks were found to be relevant to SARS and influenza pathogenesis. In conclusion, we were able to identify the pathways potentially affected by or affecting SARS-CoV-2 infection in underlying medical conditions likely to confer susceptibility and/or the severity of COVID-19. Our findings have implications in future COVID-19 experimental research and treatment development.

Involvement of astrocyte and oligodendrocyte gene sets in migraine

Cephalalgia, doi:10.1177/0333102415618614, 2015, Vol 36 (7), pp. 640-647, Cited By ~ 8

Author(s): Else Eising, Christiaan de Leeuw, Josine L Min, Verneri Anttila, Mark HG Verheijen, ...

Keyword(s): Migraine With Aura, Genetic Background, Migraine Without Aura, Protein Modification, Association Studies, Cell Types, Familial Hemiplegic Migraine, Genome Wide Association Studies, Brain Disorder, Gene Sets

Background: Migraine is a common episodic brain disorder characterized by recurrent attacks of severe unilateral headache and additional neurological symptoms. Two main migraine types can be distinguished based on the presence of aura symptoms that can accompany the headache: migraine with aura and migraine without aura. Multiple genetic and environmental factors confer disease susceptibility. Recent genome-wide association studies (GWAS) indicate that migraine susceptibility genes are involved in various pathways, including neurotransmission, which have already been implicated in genetic studies of monogenic familial hemiplegic migraine, a subtype of migraine with aura.

Methods: To further explore the genetic background of migraine, we performed a gene set analysis of migraine GWAS data of 4954 clinic-based patients with migraine, as well as 13,390 controls. Curated sets of synaptic genes and sets of genes predominantly expressed in three glial cell types (astrocytes, microglia and oligodendrocytes) were investigated.

Discussion: Our results show that gene sets containing astrocyte- and oligodendrocyte-related genes are associated with migraine, which is especially true for gene sets involved in protein modification and signal transduction. Observed differences between migraine with aura and migraine without aura indicate that both migraine types, at least in part, seem to have a different genetic background.

1000 Genomes-based meta-analysis identifies 10 novel loci for kidney function

Scientific Reports, doi:10.1038/srep45040, 2017, Vol 7 (1), Cited By ~ 51

Author(s): Mathias Gorski, Peter J. van der Most, Alexander Teumer, Audrey Y. Chu, Man Li, ...

Keyword(s): Kidney Function, Kidney Development, Association Studies, Meta Analysis, European Ancestry, P Value, Genome Wide Association Studies, 1000 Genomes, Gene Sets, Project Promise

Abstract

HapMap imputed genome-wide association studies (GWAS) have revealed >50 loci at which common variants with minor allele frequency >5% are associated with kidney function. GWAS using more complete reference sets for imputation, such as those from The 1000 Genomes project, promise to identify novel loci that have been missed by previous efforts. To investigate the value of such a more complete variant catalog, we conducted a GWAS meta-analysis of kidney function based on the estimated glomerular filtration rate (eGFR) in 110,517 European ancestry participants using 1000 Genomes imputed data. We identified 10 novel loci with p-value < 5 × 10⁻⁸ previously missed by HapMap-based GWAS. Six of these loci (HOXD8, ARL15, PIK3R1, EYA4, ASTN2, and EPB41L3) are tagged by common SNPs unique to the 1000 Genomes reference panel. Using pathway analysis, we identified 39 significant (FDR < 0.05) genes and 127 significantly (FDR < 0.05) enriched gene sets, which were missed by our previous analyses. Among those, the 10 identified novel genes are part of pathways of kidney development, carbohydrate metabolism, cardiac septum development and glucose metabolism. These results highlight the utility of re-imputing from denser reference panels, until whole-genome sequencing becomes feasible in large samples.
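The abstract reports genes and gene sets at FDR < 0.05 without naming the FDR procedure. One widely used choice is the Benjamini-Hochberg step-up adjustment, sketched below in Python as an assumption and not as the study's actual pipeline; the p-values and the function name are made up for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical gene-set p-values (not taken from the study).
p_values = [0.01, 0.04, 0.03, 0.005]
q_values = benjamini_hochberg(p_values)
significant = [q < 0.05 for q in q_values]
```

A gene set is then called enriched when its adjusted value falls below the chosen FDR level, here 0.05.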

Gene-Based Nonparametric Testing of Interactions Using Distance Correlation Coefficient in Case-Control Association Studies

Genes, doi:10.3390/genes9120608, 2018, Vol 9 (12), pp. 608

Author(s): Yingjie Guo, Chenxi Wu, Maozu Guo, Xiaoyan Liu, Alon Keinan

Keyword(s): Correlation Coefficient, Statistical Power, Association Studies, Gene Interaction, P Value, Genome Wide Association Studies, Nucleotide Polymorphisms, Real World Data, Distance Correlation, The Difference

Among the various statistical methods for identifying gene–gene interactions in qualitative genome-wide association studies (GWAS), gene-based methods have recently grown in popularity because they confer advantages in both statistical power and biological interpretability. However, most of these methods make strong assumptions about the form of the relationship between traits and single-nucleotide polymorphisms, which result in limited statistical power. In this paper, we propose a gene-based method based on the distance correlation coefficient called gene-based gene-gene interaction via distance correlation coefficient (GBDcor). The distance correlation (dCor) is a measurement of the dependency between two random vectors with arbitrary, and not necessarily equal, dimensions. We used the difference in dCor in case and control datasets as an indicator of gene–gene interaction, which was based on the assumption that the joint distribution of two genes in case subjects and in control subjects should not be significantly different if the two genes do not interact. We designed a permutation-based statistical test to evaluate the difference between dCor in cases and controls for a pair of genes, and we provided the p-value for the statistic to represent the significance of the interaction between the two genes. In experiments with both simulated and real-world data, our method outperformed previous approaches in detecting interactions accurately.
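The idea of comparing dCor between cases and controls can be sketched in a few lines of Python. The code below computes the sample distance correlation from double-centered Euclidean distance matrices and then shuffles case/control labels to obtain a permutation p-value for the absolute difference. The label-shuffling scheme, the function names and the random toy genotypes are assumptions for illustration; the abstract does not give GBDcor's exact statistic or permutation scheme, so this should not be read as the authors' implementation.

```python
import math
import random

def _centered_distances(rows):
    """Double-centered matrix of pairwise Euclidean distances between rows."""
    n = len(rows)
    d = [[math.dist(rows[i], rows[j]) for j in range(n)] for i in range(n)]
    row_mean = [sum(r) / n for r in d]  # equals the column means (d is symmetric)
    grand = sum(row_mean) / n
    return [[d[i][j] - row_mean[i] - row_mean[j] + grand for j in range(n)]
            for i in range(n)]

def distance_correlation(x, y):
    """Sample distance correlation of two row-aligned lists of vectors."""
    a, b = _centered_distances(x), _centered_distances(y)
    n = len(x)

    def mean_prod(p, q):
        return sum(p[i][j] * q[i][j] for i in range(n) for j in range(n)) / (n * n)

    dcov2, dvar_x, dvar_y = mean_prod(a, b), mean_prod(a, a), mean_prod(b, b)
    if dvar_x * dvar_y == 0:
        return 0.0  # one of the inputs is constant
    return math.sqrt(max(dcov2, 0.0) / math.sqrt(dvar_x * dvar_y))

def dcor_difference_test(gene1, gene2, is_case, n_perm=200, seed=0):
    """Permutation test for |dCor(cases) - dCor(controls)| (assumed scheme)."""
    rng = random.Random(seed)

    def stat(labels):
        case = [i for i, c in enumerate(labels) if c]
        ctrl = [i for i, c in enumerate(labels) if not c]
        d_case = distance_correlation([gene1[i] for i in case], [gene2[i] for i in case])
        d_ctrl = distance_correlation([gene1[i] for i in ctrl], [gene2[i] for i in ctrl])
        return abs(d_case - d_ctrl)

    observed = stat(is_case)
    labels = list(is_case)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # break the link between genotypes and disease status
        hits += stat(labels) >= observed
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical toy data: 24 subjects, gene1 with two SNPs, gene2 with one SNP (0/1/2 coding).
toy = random.Random(1)
gene1 = [[toy.randint(0, 2), toy.randint(0, 2)] for _ in range(24)]
gene2 = [[toy.randint(0, 2)] for _ in range(24)]
is_case = [i < 12 for i in range(24)]
observed, p_value = dcor_difference_test(gene1, gene2, is_case, n_perm=50)
```

Because the two genes may contain different numbers of SNPs, each subject contributes one vector per gene, and only the pairwise distances between subjects enter the statistic.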

An Exploration of Gene-Gene Interactions and Their Effects on Hypertension

International Journal of Genomics, doi:10.1155/2017/7208318, 2017, Vol 2017, pp. 1-9, Cited By ~ 7

Author(s): Ying Meng, Susan Groth, Jill R. Quinn, John Bisognano, Tong Tong Wu

Keyword(s): Interaction Analysis, Association Studies, Independent Set, Gene Interaction, Single Locus, Gene Interactions, Genome Wide Association Studies, Original Cohort, Genome Wide, Heart Study

Hypertension tends to run in families, and its heritability is estimated to be around 20–60%. So far, most of this heritability has not been explained by single-locus genome-wide association studies. Therefore, the current study explored gene-gene interactions that have the potential to partially fill in the missing heritability. A two-stage discovery-confirmatory analysis was carried out in the Framingham Heart Study cohorts. The first stage was an exhaustive pairwise search performed in 2320 early-onset hypertensive cases with matched normotensive controls from the offspring cohort. Then, identified gene-gene interactions were assessed in an independent set of 694 subjects from the original cohort. Four unique gene-gene interactions were found to be related to hypertension. Three detected genes were recognized by previous studies, and the other 5 loci/genes (MAN1A1, LMO3, NPAP1/SNRPN, DNAL4, and RNA5SP455/KRT8P5) were novel findings, which had no strong main effect on hypertension and could not be easily identified by single-locus genome-wide studies. Also, including the identified gene-gene interactions explained more of the variance in hypertension. Overall, our study provides evidence that genome-wide gene-gene interaction analysis can potentially identify new susceptibility genes, which can provide more insights into the genetic background of blood pressure regulation.

Identification of Critical Core Genes of Sarcoma Based on Centrality Analysis of Networks Nodes

Journal of Medical Imaging and Health Informatics, doi:10.1166/jmihi.2020.3080, 2020, Vol 10 (7), pp. 1776-1784

Author(s): Shudong Wang, Jixiao Wang, Xinzeng Wang, Yuanyuan Zhang, Tao Yi

Keyword(s): Association Studies, Meta Analysis, Complex Diseases, Enrichment Analysis, Gene Interaction, Core Gene, Genome Wide Association, Genome Wide Association Studies, Gene Set, Genome Wide

Genome-wide association studies (GWAS) are powerful tools for identifying pathogenic genes of complex diseases and revealing the genetic structure of diseases. However, due to gene-to-gene interactions, only a part of the hereditary factors can be revealed. Meta-analysis based on GWAS can integrate gene expression data at multiple levels and reveal the complex relationships between genes. Therefore, we used meta-analysis to integrate GWAS data of sarcoma to establish complex networks and discuss their significant genes. First, we established gene interaction networks based on data from different subtypes of sarcoma and analyzed the node centralities of genes. Second, we calculated the significance score of each gene according to the Staged Significant Gene Network Algorithm (SSGNA). Then, we obtained the critical gene set HYC of sarcoma by ranking the scores, and combined Gene Ontology enrichment analysis and protein network analysis to screen it further. Finally, the critical core gene set Hcore containing 47 genes was obtained and validated by GEPIA analysis. Our method generalizes to some extent to the study of complex diseases for which prior knowledge is available, and it is a useful supplement to genome-wide association studies.

Genome-Wide Association Studies Suggest Limited Immune Gene Enrichment in Schizophrenia Compared to 5 Autoimmune Diseases

Schizophrenia Bulletin, doi:10.1093/schbul/sbw059, 2016, Vol 42 (5), pp. 1176-1184, Cited By ~ 34

Author(s): Jennie G. Pouget, Vanessa F. Gonçalves, Sarah L. Spain, Hilary K. Finucane, Soumya Raychaudhuri, ...

Keyword(s): Autoimmune Diseases, Association Studies, Genome Wide Association, Immune Gene, Genome Wide Association Studies, Gene Enrichment, Genome Wide

Incorporating biological knowledge in the search for gene × gene interaction in genome-wide association studies

BMC Proceedings, doi:10.1186/1753-6561-3-s7-s81, 2009, Vol 3 (S7), Cited By ~ 3

Author(s): Alisa K Manning, Julius Suh Ngwa, Audrey E Hendricks, Ching-Ti Liu, Andrew D Johnson, ...

Keyword(s): Association Studies, Gene Interaction, Genome Wide Association, Biological Knowledge, Genome Wide Association Studies, Genome Wide

TWAS pathway method greatly enhances the number of leads for uncovering the molecular underpinnings of psychiatric disorders

doi:10.1101/373050, 2018

Author(s): Chris Chatzinakos, Donghyung Lee, Na Cai, Vladimir I. Vladimirov, Anna Docherty, ...

Keyword(s): Psychiatric Disorders, Association Studies, Genome Wide Association Studies, Computational Burden, Gene Sets, Genome Wide, Genetic Signal, Meta Analyses, Combine Information, Or Genes

Abstract

Genetic signal detection in genome-wide association studies (GWAS) is enhanced by pooling small signals from multiple single nucleotide polymorphisms (SNPs), e.g. across genes and pathways. Because genes are believed to influence traits via gene expression, it is of interest to combine information from expression quantitative trait loci (eQTLs) in a gene or genes in the same pathway. Such methods, widely referred to as transcriptomic wide association analysis (TWAS), already exist for gene analysis. Due to the possibility of eliminating most of the confounding effect of linkage disequilibrium (LD) from TWAS gene statistics, pathway TWAS methods would be very useful in uncovering the true molecular bases of psychiatric disorders. However, such methods are not yet available for arbitrarily large pathways/gene sets. This is possibly due to the quadratic (in the number of SNPs) computational burden of computing LD across large regions. To overcome this obstacle, we propose JEPEGMIX2-P, a novel TWAS pathway method that i) has a linear computational burden, ii) uses a large and diverse reference panel (33K subjects), iii) is competitive (adjusts for background enrichment in gene TWAS statistics) and iv) is applicable as-is to ethnically mixed cohorts. To underline its potential for increasing the power to uncover genetic signals over the state-of-the-art and commonly used non-transcriptomics methods, e.g. MAGMA, we applied JEPEGMIX2-P to summary statistics of most large meta-analyses from the Psychiatric Genetics Consortium (PGC). While our work is just the very first step toward clinical translation of psychiatric disorders, PGC anorexia results suggest a possible avenue for treatment.

Polygenic risk scores for psychiatric, inflammatory, and cardio-metabolic traits and diseases highlight possible genetic overlaps with suicide attempt and treatment-emergent suicidal ideation

doi:10.1101/2021.03.08.21253145, 2021

Author(s): Giuseppe Fanelli, Marcus Sokolowski, Danuta Wasserman, Siegfried Kasper, Joseph Zohar, ...

Keyword(s): Suicidal Ideation, Suicide Attempt, Association Studies, Risk Scores, Genome Wide Association Studies, Genetic Covariance, Polygenic Risk, Metabolic Traits, Gene Sets, Artery Disease

Abstract

Suicide is the second leading cause of death among young people. Genetics may contribute to suicidal phenotypes and their co-occurrence in other psychiatric and medical conditions. Our study aimed to investigate the association of polygenic risk scores (PRSs) for 22 psychiatric, inflammatory, and cardio-metabolic traits and diseases with suicide attempt (SA) or treatment-worsening/emergent suicidal ideation (TWESI). PRSs were computed based on summary statistics of genome-wide association studies. Regression analyses were performed between PRSs and SA or TWESI in four clinical cohorts, including up to 3,834 individuals, and results were meta-analyzed across samples. Stratified genetic covariance analyses were performed to investigate the biology underlying cross-phenotype PRS associations. After Bonferroni correction, PRS for major depressive disorder (MDD) was positively associated with SA (p=1.7e-4). Nominal associations were shown between PRSs for coronary artery disease (CAD) (p=4.6e-3) or loneliness (p=0.009) and SA, and between PRSs for MDD or CAD and TWESI (p=0.033 and p=0.032, respectively). Genetic covariance between MDD and SA was shown in 35 gene sets related to drugs having anti-suicidal effects. A higher genetic liability for MDD may underlie a higher risk of SA. Further, but milder, possible modulatory factors are genetic risk for loneliness and CAD.
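For readers unfamiliar with how a polygenic risk score is built from GWAS summary statistics, the following minimal Python sketch uses the common weighted-sum definition: each effect-allele dosage is multiplied by its per-allele effect size and the products are summed. This definition is assumed here for illustration; the abstract does not describe the study's scoring software, SNP selection or thresholds, and the SNP identifiers, weights and dosages below are hypothetical.

```python
def polygenic_risk_score(dosages, weights):
    """Weighted sum of effect-allele dosages over SNPs that have a weight."""
    return sum(weights[snp] * dose for snp, dose in dosages.items() if snp in weights)

# Hypothetical per-allele effect sizes, e.g. log odds ratios from summary statistics.
weights = {"snp_a": 0.10, "snp_b": -0.05, "snp_c": 0.20}

# Hypothetical effect-allele dosages (0, 1 or 2 copies) for two individuals.
cohort = {
    "person_1": {"snp_a": 2, "snp_b": 1, "snp_c": 0},
    "person_2": {"snp_a": 0, "snp_b": 2, "snp_d": 1},  # snp_d has no weight and is skipped
}

scores = {pid: polygenic_risk_score(d, weights) for pid, d in cohort.items()}
```

The resulting per-person scores are the kind of predictor that would then enter a regression against an outcome such as SA or TWESI.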
