Use of Wrapper Algorithms Coupled with a Random Forests Classifier for Variable Selection in Large-Scale Genomic Association Studies

Andrei S. Rodin; Anatoliy Litvinenko; Kathy Klos; Alanna C. Morrison; Trevor Woodage; Josef Coresh; Eric Boerwinkle

doi:10.1089/cmb.2008.0037

Sex differences in white adipose tissue expansion: emerging molecular mechanisms

Clinical Science ◽

10.1042/cs20210086 ◽

2021 ◽

Vol 135 (24) ◽

pp. 2691-2708

Author(s):

Simon T. Bond ◽

Anna C. Calkin ◽

Brian G. Drew

Keyword(s):

Sex Differences ◽

Sexual Dimorphism ◽

Gene Networks ◽

Large Scale ◽

Molecular Mechanisms ◽

Association Studies ◽

Tissue Expansion ◽

Economic Systems ◽

Genomic Association ◽

Males And Females

Abstract The escalating prevalence of individuals becoming overweight and obese is a rapidly rising global health problem, placing an enormous burden on health and economic systems worldwide. Whilst obesity has well-described lifestyle drivers, there is also a significant and poorly understood component that is regulated by genetics. Furthermore, there is clear evidence for sexual dimorphism in obesity, where overall risk, degree, subtype and potential complications arising from obesity all differ between males and females. The molecular mechanisms that dictate these sex differences remain mostly uncharacterised. Many studies have demonstrated that this dimorphism cannot be explained solely by changes in hormones and their nuclear receptors, and instead manifests from coordinated and highly regulated gene networks, both during development and throughout life. The more knowledge we acquire in this area from approaches such as large-scale genomic association studies, the more we appreciate the true complexity and heterogeneity of obesity. Nevertheless, over the past two decades, researchers have made enormous progress in this field, and some consistent and robust mechanisms continue to be established. In this review, we will discuss some of the proposed mechanisms underlying sexual dimorphism in obesity, and discuss some of the key regulators that influence this phenomenon.

Concept, Design and Implementation of a Cardiovascular Gene-Centric 50 K SNP Array for Large-Scale Genomic Association Studies

PLoS ONE ◽

10.1371/journal.pone.0003583 ◽

2008 ◽

Vol 3 (10) ◽

pp. e3583 ◽

Cited By ~ 294

Author(s):

Brendan J. Keating ◽

Sam Tischfield ◽

Sarah S. Murray ◽

Tushar Bhangale ◽

Thomas S. Price ◽

...

Keyword(s):

Large Scale ◽

Association Studies ◽

Snp Array ◽

Concept Design ◽

Design And Implementation ◽

Genomic Association

Bayesian variable selection regression for genome-wide association studies and other large-scale problems

The Annals of Applied Statistics ◽

10.1214/11-aoas455 ◽

2011 ◽

Vol 5 (3) ◽

pp. 1780-1815 ◽

Cited By ~ 194

Author(s):

Yongtao Guan ◽

Matthew Stephens

Keyword(s):

Variable Selection ◽

Large Scale ◽

Association Studies ◽

Bayesian Variable Selection ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Large Scale Problems ◽

Genome Wide

r2VIM: A new variable selection method for random forests in genome-wide association studies

BioData Mining ◽

10.1186/s13040-016-0087-3 ◽

2016 ◽

Vol 9 (1) ◽

Cited By ~ 24

Author(s):

Silke Szymczak ◽

Emily Holzinger ◽

Abhijit Dasgupta ◽

James D. Malley ◽

Anne M. Molloy ◽

...

Keyword(s):

Variable Selection ◽

Random Forests ◽

Association Studies ◽

Selection Method ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Variable Selection Method ◽

Genome Wide

Optimized permutation testing for information theoretic measures of multi-gene interactions

BMC Bioinformatics ◽

10.1186/s12859-021-04107-6 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

James M. Kunert-Graf ◽

Nikita A. Sakhanenko ◽

David J. Galas

Keyword(s):

Large Scale ◽

Permutation Test ◽

Association Studies ◽

Genome Wide Association Studies ◽

Permutation Testing ◽

Exact Test ◽

Information Theoretic ◽

Information Theoretic Measures ◽

Full Analysis ◽

Computational Bottleneck

Abstract Background: Permutation testing is often considered the “gold standard” for multi-test significance analysis, as it is an exact test requiring few assumptions about the distribution being computed. However, it can be computationally very expensive, particularly in its naive form in which the full analysis pipeline is re-run after permuting the phenotype labels. This can become intractable in multi-locus genome-wide association studies (GWAS), in which the number of potential interactions to be tested is combinatorially large. Results: In this paper, we develop an approach for permutation testing in multi-locus GWAS, specifically focusing on SNP–SNP-phenotype interactions using multivariable measures that can be computed from frequency count tables, such as those based on information theory. We find that the computational bottleneck in this process is the construction of the count tables themselves, and that this step can be eliminated at each iteration of the permutation testing by transforming the count tables directly. This leads to a speed-up by a factor of over 10³ for a typical permutation test compared to the naive approach. Additionally, this approach is insensitive to the number of samples, making it suitable for datasets with large numbers of samples. Conclusions: The proliferation of large-scale datasets with genotype data for hundreds of thousands of individuals enables new and more powerful approaches for the detection of multi-locus genotype-phenotype interactions. Our approach significantly improves the computational tractability of permutation testing for these studies. Moreover, our approach is insensitive to the large number of samples in these modern datasets. The code for performing these computations and replicating the figures in this paper is freely available at https://github.com/kunert/permute-counts.
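
For orientation, the naive baseline that the abstract contrasts against can be sketched as follows. This is a minimal illustration only, not the authors' optimized count-table transformation and not code from their repository. The function names, the choice of plug-in mutual information as the count-table measure, and the add-one p-value estimate are all assumptions made for the example.

```python
import math
import random
from collections import Counter

def mutual_information(x, y):
    """Plug-in mutual information (bits) computed from the joint count table of x and y."""
    n = len(x)
    joint = Counter(zip(x, y))      # the frequency count table
    px, py = Counter(x), Counter(y)
    return sum(c / n * math.log2(c * n / (px[a] * py[b]))
               for (a, b), c in joint.items())

def naive_permutation_pvalue(genotypes, phenotypes, n_perm=1000, seed=0):
    """Naive permutation test: rebuild the count table after every label shuffle."""
    rng = random.Random(seed)
    observed = mutual_information(genotypes, phenotypes)
    labels = list(phenotypes)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)         # permute phenotype labels in place
        if mutual_information(genotypes, labels) >= observed:
            hits += 1
    # Add-one estimate keeps the p-value strictly positive (an assumed convention).
    return (hits + 1) / (n_perm + 1)
```

For a SNP–SNP-phenotype test, each element of `genotypes` could be a tuple such as `(g1, g2)`. Rebuilding the count table inside the loop is the step the abstract identifies as the bottleneck.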

Guidelines for Large-Scale Sequence-Based Complex Trait Association Studies: Lessons Learned from the NHLBI Exome Sequencing Project

The American Journal of Human Genetics ◽

10.1016/j.ajhg.2016.08.012 ◽

2016 ◽

Vol 99 (4) ◽

pp. 791-801 ◽

Cited By ~ 48

Author(s):

Paul L. Auer ◽

Alex P. Reiner ◽

Gao Wang ◽

Hyun Min Kang ◽

Goncalo R. Abecasis ◽

...

Keyword(s):

Exome Sequencing ◽

Large Scale ◽

Association Studies ◽

Complex Trait ◽

Lessons Learned ◽

Sequencing Project ◽

Trait Association ◽

Exome Sequencing Project ◽

Scale Sequence

Power comparison of Cochran-Armitage trend test against allelic and genotypic tests in large-scale case-control genetic association studies

Statistical Methods in Medical Research ◽

10.1177/0962280216683979 ◽

2016 ◽

Vol 27 (9) ◽

pp. 2657-2673 ◽

Cited By ~ 2

Author(s):

Mathieu Emily

Keyword(s):

Large Scale ◽

Association Studies ◽

Disease Model ◽

Trend Test ◽

Genome Wide Association Studies ◽

Power Functions ◽

Power Comparison ◽

Powerful Test ◽

Armitage Trend Test ◽

Mode Of Inheritance

The Cochran-Armitage trend test (CA) has become a standard procedure for association testing in large-scale genome-wide association studies (GWAS). However, when the disease model is unknown, there is no consensus on the most powerful test to be used between CA, allelic, and genotypic tests. In this article, we tackle the question of whether CA is best suited to single-locus scanning in GWAS and propose a power comparison of CA against allelic and genotypic tests. Our approach relies on the evaluation of the Taylor decompositions of non-centrality parameters, thus allowing an analytical comparison of the power functions of the tests. Compared to simulation-based comparison, our approach offers the advantage of simultaneously accounting for the multidimensionality of the set of features involved in power functions. Although power for CA depends on the sample size, the case-to-control ratio and the minor allelic frequency (MAF), our results first show that it is largely influenced by the mode of inheritance and a deviation from Hardy–Weinberg Equilibrium (HWE). Furthermore, when compared to other tests, CA is shown to be the most powerful test under a multiplicative disease model or when the single-nucleotide polymorphism largely deviates from HWE. In all other situations, CA lacks in power and differences can be substantial, especially for the recessive mode of inheritance. Finally, our results are illustrated by the comparison of the performances of the statistics in two genome scans.
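
As a concrete reference point for the test discussed in this abstract, the standard Cochran-Armitage statistic for a 2 × 3 case-control genotype table can be sketched as below. This is the textbook form of the test, not the paper's Taylor-expansion power analysis. The function name, the additive weights (0, 1, 2), and the normal approximation for the p-value are assumptions chosen for the example.

```python
import math

def cochran_armitage(cases, controls, weights=(0, 1, 2)):
    """Return (z, two-sided p) for the trend test on a 2 x k genotype count table."""
    n = [r + s for r, s in zip(cases, controls)]   # per-genotype column totals
    R, S = sum(cases), sum(controls)
    N = R + S
    tr = sum(t * r for t, r in zip(weights, cases))
    tn = sum(t * c for t, c in zip(weights, n))
    t2n = sum(t * t * c for t, c in zip(weights, n))
    T = N * tr - R * tn                            # trend statistic
    var = R * S * (N * t2n - tn ** 2) / N          # its variance under the null
    if var <= 0:
        raise ValueError("degenerate table: the trend variance is zero")
    z = T / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2))           # two-sided normal p-value
    return z, p
```

Calling `cochran_armitage([10, 20, 30], [30, 20, 10])` gives a positive `z` because cases are enriched for the higher-weight genotypes. Changing `weights` to (0, 0, 1) or (0, 1, 1) would target recessive or dominant models under the usual coding.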

A large‐scale genomic association analysis identifies a fragment in Dt11 chromosome conferring cotton Verticillium wilt resistance

Plant Biotechnology Journal ◽

10.1111/pbi.13650 ◽

2021 ◽

Author(s):

Yan Zhang ◽

Bin Chen ◽

Zhengwen Sun ◽

Zhengwen Liu ◽

Yanru Cui ◽

...

Keyword(s):

Association Analysis ◽

Verticillium Wilt ◽

Large Scale ◽

Wilt Resistance ◽

Verticillium Wilt Resistance ◽

Genomic Association

Within and across populations complex traits and diseases prediction using summary statistics from large-scale genomewide association studies

10.14264/11574da ◽

2021 ◽

Author(s):

Ying Wang

Keyword(s):

Complex Traits ◽

Large Scale ◽

Association Studies ◽

Summary Statistics ◽

Genomewide Association

Imputation-Aware Tag SNP Selection To Improve Power for Large-Scale, Multi-ethnic Association Studies

G3 Genes|Genome|Genetics ◽

10.1534/g3.118.200502 ◽

2018 ◽

Vol 8 (10) ◽

pp. 3255-3267 ◽

Cited By ~ 10

Author(s):

Genevieve L. Wojcik ◽

Christian Fuchsberger ◽

Daniel Taliun ◽

Ryan Welch ◽

Alicia R Martin ◽

...

Keyword(s):

Large Scale ◽

Association Studies ◽

Tag Snp ◽

Tag Snp Selection
