variancePartition: Interpreting drivers of variation in complex gene expression studies

2016 ◽  
Author(s):  
Gabriel E Hoffman ◽  
Eric E Schadt

As genomics studies become more complex and consider multiple sources of biological and technical variation, characterizing these drivers of variation becomes essential to understanding disease biology and regulatory genetics. We describe a statistical and visualization framework, variancePartition, to prioritize drivers of variation with a genome-wide summary, and identify genes that deviate from the genome-wide trend. variancePartition enables rapid interpretation of complex gene expression studies and is applicable to many genomics assays.
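The idea of attributing a gene's expression variance to individual study variables can be illustrated with a toy calculation. The sketch below is not the variancePartition method itself, which the abstract describes only at a high level; it assumes a single categorical variable per call and uses the classical between-group share of the total sum of squares. The function name `variance_fraction` and the toy data are hypothetical.

```python
from collections import defaultdict


def variance_fraction(values, labels):
    """Share of the total sum of squares explained by one categorical variable.

    Simplified one-variable illustration (hypothetical helper), not the
    model used by the variancePartition package.
    """
    n = len(values)
    grand_mean = sum(values) / n
    total_ss = sum((v - grand_mean) ** 2 for v in values)
    if total_ss == 0:
        return 0.0  # a constant gene has no variance to attribute

    # Collect the expression values observed under each level of the variable.
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[label].append(value)

    # Between-group sum of squares: group size times squared mean offset.
    between_ss = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )
    return between_ss / total_ss


# Toy data for one gene measured in eight samples (made-up numbers).
expression = [1.0, 1.2, 3.0, 3.1, 1.1, 0.9, 2.9, 3.2]
tissue = ["A", "A", "B", "B", "A", "A", "B", "B"]
batch = [1, 2, 1, 2, 1, 2, 1, 2]

print("tissue:", round(variance_fraction(expression, tissue), 3))
print("batch:", round(variance_fraction(expression, batch), 3))
```

In this toy example tissue accounts for almost all of the variation and batch for almost none. Repeating the calculation for every gene would give the kind of genome-wide summary the abstract refers to, with outlier genes being those whose fractions differ sharply from the typical pattern.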

2007 ◽  
Vol 8 (5) ◽  
pp. R74 ◽  
Author(s):  
Björn Nilsson ◽  
Petra Håkansson ◽  
Mikael Johansson ◽  
Sven Nelander ◽  
Thoas Fioretos

2017 ◽  
Author(s):  
Chris Chatzinakos ◽  
Donghyung Lee ◽  
Bradley T Webb ◽  
Vladimir I Vladimirov ◽  
Kenneth S Kendler ◽  
...  

Motivation: To increase detection power, researchers use gene-level analysis methods to aggregate weak marker signals. Because gene expression controls biological processes, researchers have proposed aggregating signals for expression quantitative trait loci (eQTL). Most gene-level eQTL methods make statistical inferences based on (i) summary statistics from genome-wide association studies (GWAS) and (ii) linkage disequilibrium (LD) patterns from a relevant reference panel. While most such tools assume homogeneous cohorts, our Gene-level Joint Analysis of functional SNPs in Cosmopolitan Cohorts (JEPEGMIX) method accommodates cosmopolitan cohorts by using heterogeneous panels. However, JEPEGMIX relies on brain eQTLs from older gene expression studies and does not adjust for background enrichment in GWAS signals.

Results: We propose JEPEGMIX2, an extension of JEPEGMIX. Compared to JEPEGMIX, it uses (i) cis-eQTL SNPs from the latest expression studies and (ii) brain-specific (sub)tissues as well as tissues other than brain. JEPEGMIX2 also (i) avoids accumulating averagely enriched polygenic information by adjusting for background enrichment and (ii), to avoid an increase in false positive rates for studies with numerous highly enriched (above the background) genes, outputs gene q-values based on Holm adjustment of p-values.

Supplementary information: Supplementary material is available at Bioinformatics online.
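For readers unfamiliar with the Holm adjustment mentioned above, a minimal sketch of the standard step-down procedure follows. It is a generic implementation, not code from JEPEGMIX2; the function name `holm_adjust` and the example p-values are hypothetical.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values in the original input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (rank = k - 1) is multiplied by m - k + 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: an adjusted value never drops below an earlier one.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Made-up gene-level p-values, for illustration only.
gene_p = [0.01, 0.04, 0.03]
print(holm_adjust(gene_p))
```

The procedure controls the family-wise error rate and is uniformly at least as powerful as a plain Bonferroni correction, which multiplies every p-value by the full number of tests.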


PeerJ ◽  
2018 ◽  
Vol 6 ◽  
pp. e5207 ◽  
Author(s):  
Vinayak Palve ◽  
Manisha Pareek ◽  
Neeraja M. Krishnan ◽  
Gangotri Siddappa ◽  
Amritha Suresh ◽  
...  

Selection of the right reference gene(s) is crucial in the analysis and interpretation of gene expression data. The aim of the present study was to discover and validate a minimal set of internal control genes for head and neck tumor studies. We analyzed data from multiple sources (in-house whole-genome gene expression microarrays, previously published quantitative real-time PCR (qPCR) data, and RNA-seq data from TCGA) to derive a list of 18 genes (discovery set) that had the lowest variance and a high level of expression across tumors and their matched normal samples. The genes in the discovery set were ranked for stability and variance in expression across tissues using four different algorithms (BestKeeper, geNorm, NormFinder, and comparative delta Ct) and a web-based comparative tool, RefFinder. Finally, we validated their expression using qPCR in an additional set of tumor:matched normal samples, which resulted in five genes (RPL30, RPL27, PSMC5, MTCH1, and OAZ1), of which RPL30 and RPL27 were the most stable and were abundantly expressed across the tissues. Our data suggest that RPL30 or RPL27 in combination with either PSMC5 or MTCH1 or OAZ1 can be used as a minimal set of control genes in head and neck tumor gene expression studies.
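The discovery criterion described above (low variance combined with high expression) can be sketched as a simple filter-and-rank step. This is an illustration under stated assumptions, not the study's pipeline and not a reimplementation of BestKeeper, geNorm, NormFinder, or RefFinder: it ranks genes by coefficient of variation after applying a minimum mean expression threshold. The function name, the threshold, and the placeholder gene names are hypothetical.

```python
from statistics import mean, pstdev


def rank_reference_candidates(expression, min_mean):
    """Rank genes by coefficient of variation (lowest first).

    `expression` maps a gene name to its values across samples.
    Genes whose mean expression is below `min_mean` are dropped.
    Hypothetical helper for illustration only.
    """
    if min_mean <= 0:
        raise ValueError("min_mean must be positive")
    ranked = []
    for gene, values in expression.items():
        avg = mean(values)
        if avg < min_mean:
            continue  # too weakly expressed to serve as a control
        ranked.append((gene, pstdev(values) / avg))
    ranked.sort(key=lambda item: item[1])
    return ranked


# Made-up expression values with placeholder gene names.
toy = {
    "GENE_A": [10, 10, 10, 10],
    "GENE_B": [10, 12, 8, 10],
    "GENE_C": [1, 1, 1, 1],
}
for gene, cv in rank_reference_candidates(toy, min_mean=5):
    print(gene, round(cv, 3))
```

Here the placeholder GENE_C is perfectly stable but fails the expression threshold, which is why the abstract's criterion pairs low variance with a high level of expression.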


2017 ◽  
Author(s):  
Alejandro Cáceres ◽  
Juan R. González

There is great interest in studying how gene co-expression networks change across tissues. However, the reproducibility assessment of these studies is challenged by a lack of fully confirmatory experiments from independent researchers. While the number of studies with expression data for several tissues is expected to grow, statistical measures are still needed to assess the reproducibility between studies. We identified a gap in the statistical literature concerning the assessment of agreement between studies across numerous conditions. The gap prevented us from testing, using standard statistics, the level of agreement between the GTEX (RNAseq) and BRAINEAC (microarray) studies in distinguishing the structure of co-expression networks across four brain tissues. We propose a generalization of a classical measure of agreement, Cohen’s κ, derive its distributional characteristics, and determine its reliability properties. In the gene expression studies, our generalization of κ showed full agreement for genome-wide networks in BRAINEAC benchmarked against GTEX, and the highest agreement for brain-specific pathways. Our highly interpretable measure can contribute to anticipated efforts on reproducibility research.
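As background for the measure being generalized, the classical two-rater Cohen's κ can be sketched as follows. This is the textbook statistic only; the generalization proposed in the abstract is not reproduced here. The function name and the example labels are hypothetical.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Classical Cohen's kappa for two equally long sequences of labels."""
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: product of marginal frequencies, summed over labels.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)

    if expected == 1.0:
        raise ValueError("kappa is undefined when only one label is ever used")
    return (observed - expected) / (1.0 - expected)


# Made-up example: whether two studies call each of six gene pairs co-expressed.
study_1 = ["yes", "yes", "no", "no", "yes", "no"]
study_2 = ["yes", "yes", "no", "yes", "yes", "no"]
print(round(cohens_kappa(study_1, study_2), 3))
```

A value of 1 indicates perfect agreement and 0 indicates agreement no better than chance, which is what makes the measure easy to interpret.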


2021 ◽  
Vol 12 ◽  
Author(s):  
Mohamed Tarek Badr ◽  
Mohamed Omar ◽  
Georg Häcker

Helicobacter pylori is a gram-negative bacterium that colonizes the human gastric mucosa and can lead to gastric inflammation, ulcers, and stomach cancer. Due to the increase in H. pylori antimicrobial resistance, new methods to identify the molecular mechanisms of H. pylori-induced pathology are urgently needed. Here we utilized a computational biology approach, harnessing genome-wide association and gene expression studies to identify genes and pathways determining disease development. We mined gene expression data related to H. pylori infection and its complications from publicly available databases to identify four human datasets as discovery datasets and used two different multi-cohort analysis pipelines to define an H. pylori-induced gene signature. An initial Helicobacter signature was curated using the MetaIntegrator pipeline and validated in cell line model datasets. With this approach we identified cell line models that best match gene regulation in human pathology. A second analysis pipeline through NetworkAnalyst was used to refine our initial signature. This approach defined a 55-gene signature that is stably deregulated in disease conditions. The 55-gene signature was validated in datasets from human gastric adenocarcinomas and could separate tumor from normal tissue. As only a small number of H. pylori patients develop cancer, this gene signature must interact with other host and environmental factors to initiate tumorigenesis. We tested for possible interactions between our curated gene signature and host genomic background mutations and polymorphisms by integrating genome-wide association studies (GWAS) and known oncogenes. We analyzed public databases to identify genes harboring single nucleotide polymorphisms (SNPs) associated with gastric pathologies and driver genes in gastric cancers. Using this approach, we identified 37 genes from GWA studies and 61 oncogenes, which were used with our 55-gene signature to map gene-gene interaction networks.
In conclusion, our analysis defines a unique gene signature that is driven by H. pylori infection at early phases and remains relevant through different stages of pathology up to gastric cancer, a stage where H. pylori itself is rarely detectable. Furthermore, this signature elucidates many factors of host gene and pathway regulation in infection and can be used as a target for drug repurposing and for testing the suitability of infection models for investigating human infection.


2013 ◽  
Vol 13 (1) ◽  
Author(s):  
Fayaz Seifuddin ◽  
Mehdi Pirooznia ◽  
Jennifer T Judy ◽  
Fernando S Goes ◽  
James B Potash ◽  
...  
