Genome wide association with quantitative resistance phenotypes in Mycobacterium tuberculosis reveals novel resistance genes and regulatory regions

10.1101/429159 ◽

2018 ◽

Cited By ~ 2

Author(s):

Maha R Farhat ◽

Luca Freschi ◽

Roger Calderon ◽

Thomas Ioerger ◽

Matthew Snyder ◽

...

Keyword(s):

Drug Resistance ◽

Mycobacterium Tuberculosis ◽

Resistance Genes ◽

Mixed Model ◽

Linear Mixed Model ◽

Quantitative Resistance ◽

Quantitative Measure ◽

Resistance Phenotype ◽

Noncoding Regions ◽

Genome Wide

Abstract: Drug resistance is threatening attempts at tuberculosis epidemic control. Molecular diagnostics for drug resistance that rely on the detection of resistance-related mutations could expedite patient care and accelerate progress in TB eradication. We performed minimum inhibitory concentration testing for 12 anti-TB drugs together with Illumina whole genome sequencing on 1452 clinical Mycobacterium tuberculosis (MTB) isolates. We then used a linear mixed model to evaluate genome-wide associations between mutations in MTB genes or noncoding regions and drug resistance, followed by validation of our findings in an independent dataset of 792 patient isolates. Novel associations at 13 genomic loci were confirmed in the validation set, with 2 involving noncoding regions. We found promoter mutations to have smaller average effects on resistance levels than gene body mutations in genes where both can contribute to resistance. Enabled by a quantitative measure of resistance, we estimated the heritability of the resistance phenotype to 11 anti-TB drugs and identified a lower than expected contribution from known resistance genes. We also report the proportion of variation in resistance levels explained by the novel loci identified here. This study highlights the complexity of the genomic mechanisms associated with the MTB resistance phenotype, including the relatively large number of potentially causative or compensatory loci, and emphasizes the contribution of the noncoding portion of the genome.

GWAS-Flow: A GPU accelerated framework for efficient permutation based genome-wide association studies

10.1101/783100 ◽

2019 ◽

Cited By ~ 2

Author(s):

Jan A. Freudenthal ◽

Markus J. Ankenbrand ◽

Dominik G. Grimm ◽

Arthur Korte

Keyword(s):

Complex Traits ◽

Mixed Model ◽

Linear Mixed Model ◽

Association Studies ◽

Large Datasets ◽

Genome Wide Association ◽

Small Data ◽

Genome Wide Association Studies ◽

Genome Wide ◽

Non Gaussian

Abstract. Motivation: Genome-wide association studies (GWAS) are one of the most commonly used methods to detect associations between complex traits and genomic polymorphisms. As both genotyping and phenotyping of large populations have become easier, typical modern GWAS have to cope with massive amounts of data. Thus, the computational demand for these analyses has grown remarkably over the last decades. This is especially true if one wants to implement permutation-based significance thresholds instead of using the naïve Bonferroni threshold. Permutation-based methods have the advantage of providing an adjusted multiple hypothesis correction threshold that takes the underlying phenotypic distribution into account and thus removes the need to find the correct transformation for non-Gaussian phenotypes. To enable efficient analyses of large datasets and the computation of permutation-based significance thresholds, we used the machine learning framework TensorFlow to develop a linear mixed model (GWAS-Flow) that can make use of the available CPU or GPU infrastructure to decrease the time of the analyses, especially for large datasets. Results: We were able to show that our application GWAS-Flow outperforms custom GWAS scripts in terms of speed without losing accuracy. Apart from p-values, GWAS-Flow also computes summary statistics, such as the effect size and its standard error for each individual marker. The CPU-based version is the default choice for small data, while the GPU-based version of GWAS-Flow is especially suited for the analyses of big data. Availability: GWAS-Flow is freely available on GitHub (https://github.com/Joyvalley/GWAS_Flow) and is released under the terms of the MIT-License.
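
The idea of a permutation-based significance threshold described in the abstract above can be illustrated with a minimal sketch. This is not GWAS-Flow's code or API: it assumes a plain marginal correlation test in place of the linear mixed model, uses the maximum absolute correlation per permutation as the test statistic (for a fixed sample size this ranks markers the same way as the smallest p-value), and the names `abs_corr` and `permutation_threshold` are hypothetical.

```python
import random
from statistics import mean


def abs_corr(x, y):
    """Absolute Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant vector carries no association signal
    return abs(sxy) / (sxx * syy) ** 0.5


def permutation_threshold(genotypes, phenotype, n_perm=100, alpha=0.05, seed=0):
    """Genome-wide threshold on |r| from phenotype permutations.

    genotypes: list of markers, each a list of allele counts per individual.
    Shuffling the phenotype breaks any genotype-phenotype link while keeping
    the phenotype's own (possibly non-Gaussian) distribution intact.
    """
    rng = random.Random(seed)
    shuffled = list(phenotype)
    max_stats = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # Strongest association seen anywhere in the genome under the null.
        max_stats.append(max(abs_corr(g, shuffled) for g in genotypes))
    max_stats.sort()
    # Approximate (1 - alpha) empirical quantile of the null maxima.
    idx = min(len(max_stats) - 1, int((1 - alpha) * len(max_stats)))
    return max_stats[idx]


# Simulated example (assumed data): 30 markers, 80 individuals,
# a skewed phenotype driven by the first marker.
rng = random.Random(1)
genotypes = [[rng.choice([0, 1, 2]) for _ in range(80)] for _ in range(30)]
phenotype = [2.0 * g + rng.expovariate(1.0) for g in genotypes[0]]

threshold = permutation_threshold(genotypes, phenotype)
hits = [i for i, g in enumerate(genotypes) if abs_corr(g, phenotype) > threshold]
print("threshold on |r|:", round(threshold, 3), "significant markers:", hits)
```

Because the null distribution is built from the observed phenotype values themselves, no transformation toward normality is needed before testing. The cost is one full genome scan per permutation, which is the computational burden the abstract refers to.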

Variable selection in heterogeneous datasets: A truncated-rank sparse linear mixed model with applications to genome-wide association studies

Methods ◽

10.1016/j.ymeth.2018.04.021 ◽

2018 ◽

Vol 145 ◽

pp. 2-9 ◽

Cited By ~ 1

Author(s):

Haohan Wang ◽

Bryon Aragam ◽

Eric P. Xing

Keyword(s):

Variable Selection ◽

Mixed Model ◽

Linear Mixed Model ◽

Association Studies ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Genome Wide ◽

Heterogeneous Datasets

Genome-Wide Association Studies Reveal Susceptibility Loci for Digital Dermatitis in Holstein Cattle

Animals ◽

10.3390/ani10112009 ◽

2020 ◽

Vol 10 (11) ◽

pp. 2009

Author(s):

Ellen Lai ◽

Alexa L. Danner ◽

Thomas R. Famula ◽

Anita M. Oberbauer

Keyword(s):

Predictive Value ◽

Mixed Model ◽

Linear Mixed Model ◽

Bos Taurus ◽

Association Studies ◽

Bayesian Regression ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Digital Dermatitis ◽

Genome Wide

Digital dermatitis (DD) causes lameness in dairy cattle. To detect the quantitative trait loci (QTL) associated with DD, genome-wide association studies (GWAS) were performed using high-density single nucleotide polymorphism (SNP) genotypes and binary case/control, quantitative (average number of FW per hoof trimming record) and recurrent (cases with ≥2 DD episodes vs. controls) phenotypes from cows across four dairies (controls n = 129 vs. FW n = 85). Linear mixed model (LMM) and random forest (RF) approaches identified the top SNPs, which were used as predictors in Bayesian regression models to assess the SNP predictive value. The LMM and RF analyses identified QTL regions containing candidate genes on Bos taurus autosome (BTA) 2 for the binary and recurrent phenotypes and BTA7 and 20 for the quantitative phenotype that related to epidermal integrity, immune function, and wound healing. Although larger sample sizes are necessary to reaffirm these small effect loci amidst a strong environmental effect, the sample cohort used in this study was sufficient for estimating SNP effects with a high predictive value.

GWAS for quantitative resistance phenotypes in Mycobacterium tuberculosis reveals resistance genes and regulatory regions

Nature Communications ◽

10.1038/s41467-019-10110-6 ◽

2019 ◽

Vol 10 (1) ◽

Cited By ~ 26

Author(s):

Maha R. Farhat ◽

Luca Freschi ◽

Roger Calderon ◽

Thomas Ioerger ◽

Matthew Snyder ◽

...

Keyword(s):

Mycobacterium Tuberculosis ◽

Resistance Genes ◽

Quantitative Resistance ◽

Regulatory Regions

Variable selection in heterogeneous datasets: A truncated-rank sparse linear mixed model with applications to genome-wide association studies

2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) ◽

10.1109/bibm.2017.8217687 ◽

2017 ◽

Cited By ~ 9

Author(s):

Haohan Wang ◽

Bryon Aragam ◽

Eric P. Xing

Keyword(s):

Variable Selection ◽

Mixed Model ◽

Linear Mixed Model ◽

Association Studies ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Genome Wide ◽

Heterogeneous Datasets

Local Genealogies in a Linear Mixed Model for Genome-Wide Association Mapping in Complex Pedigreed Populations

PLoS ONE ◽

10.1371/journal.pone.0027061 ◽

2011 ◽

Vol 6 (11) ◽

pp. e27061 ◽

Cited By ~ 2

Author(s):

Goutam Sahana ◽

Thomas Mailund ◽

Mogens Sandø Lund ◽

Bernt Guldbrandtsen

Keyword(s):

Association Mapping ◽

Mixed Model ◽

Linear Mixed Model ◽

Genome Wide Association ◽

Genome Wide

A high-coverage artificial chromosome library for the genome-wide screening of drug-resistance genes in malaria parasites

Genome Research ◽

10.1101/gr.124164.111 ◽

2012 ◽

Vol 22 (5) ◽

pp. 985-992 ◽

Cited By ~ 8

Author(s):

S. Iwanaga ◽

I. Kaneko ◽

M. Yuda

Keyword(s):

Drug Resistance ◽

Resistance Genes ◽

Artificial Chromosome ◽

Malaria Parasites ◽

High Coverage ◽

Genome Wide

Variable Selection in Heterogeneous Datasets: A Truncated-rank Sparse Linear Mixed Model with Applications to Genome-wide Association Studies

10.1101/228106 ◽

2017 ◽

Cited By ~ 2

Author(s):

Haohan Wang ◽

Bryon Aragam ◽

Eric P. Xing

Keyword(s):

Population Structure ◽

Variable Selection ◽

Mixed Model ◽

Linear Mixed Model ◽

Association Studies ◽

Genome Wide Association ◽

Low Rank ◽

Genome Wide Association Studies ◽

Unified Framework ◽

Genome Wide

Abstract: A fundamental and important challenge in modern datasets of ever increasing dimensionality is variable selection, which has taken on renewed interest recently due to the growth of biological and medical datasets with complex, non-i.i.d. structures. Naïvely applying classical variable selection methods such as the Lasso to such datasets may lead to a large number of false discoveries. Motivated by genome-wide association studies in genetics, we study the problem of variable selection for datasets arising from multiple subpopulations, when this underlying population structure is unknown to the researcher. We propose a unified framework for sparse variable selection that adaptively corrects for population structure via a low-rank linear mixed model. Most importantly, the proposed method does not require prior knowledge of sample structure in the data and adaptively selects a covariance structure of the correct complexity. Through extensive experiments, we illustrate the effectiveness of this framework over existing methods. Further, we test our method on three different genomic datasets from plants, mice, and humans, and discuss the knowledge we discover with our method.

Mutations in dnaA and a cryptic interaction site increase drug resistance in Mycobacterium tuberculosis

PLoS Pathogens ◽

10.1371/journal.ppat.1009063 ◽

2020 ◽

Vol 16 (11) ◽

pp. e1009063

Author(s):

Nathan D. Hicks ◽

Samantha R. Giffen ◽

Peter H. Culviner ◽

Michael C. Chao ◽

Charles L. Dulberger ◽

...

Keyword(s):

Drug Resistance ◽

Mycobacterium Tuberculosis ◽

Genome Wide Association Study ◽

Critical Concentration ◽

Drug Susceptibility ◽

Interaction Site ◽

Clinical Strains ◽

Synonymous Mutations ◽

Genome Wide ◽

A Genome

Genomic dissection of antibiotic resistance in bacterial pathogens has largely focused on genetic changes conferring growth above a single critical concentration of drug. However, reduced susceptibility to antibiotics—even below this breakpoint—is associated with poor treatment outcomes in the clinic, including in tuberculosis. Clinical strains of Mycobacterium tuberculosis exhibit extensive quantitative variation in antibiotic susceptibility but the genetic basis behind this spectrum of drug susceptibility remains ill-defined. Through a genome wide association study, we show that non-synonymous mutations in dnaA, which encodes an essential and highly conserved regulator of DNA replication, are associated with drug resistance in clinical M. tuberculosis strains. We demonstrate that these dnaA mutations specifically enhance M. tuberculosis survival during isoniazid treatment via reduced expression of katG, the activator of isoniazid. To identify DnaA interactors relevant to this phenotype, we perform the first genome-wide biochemical mapping of DnaA binding sites in mycobacteria which reveals a DnaA interaction site that is the target of recurrent mutation in clinical strains. Reconstructing clinically prevalent mutations in this DnaA interaction site reproduces the phenotypes of dnaA mutants, suggesting that clinical strains of M. tuberculosis have evolved mutations in a previously uncharacterized DnaA pathway that quantitatively increases resistance to the key first-line antibiotic isoniazid. Discovering genetic mechanisms that reduce drug susceptibility and support the evolution of high-level drug resistance will guide development of biomarkers capable of prospectively identifying patients at risk of treatment failure in the clinic.

Multivariate genome-wide association study of leaf shape in a Populus deltoides and P. simonii F1 pedigree

PLoS ONE ◽

10.1371/journal.pone.0259278 ◽

2021 ◽

Vol 16 (10) ◽

pp. e0259278

Author(s):

Wenguo Yang ◽

Dan Yao ◽

Hainan Wu ◽

Wei Zhao ◽

Yuhua Chen ◽

...

Keyword(s):

Leaf Morphology ◽

Mixed Model ◽

Linear Mixed Model ◽

Populus Deltoides ◽

Leaf Shape ◽

Leaf Traits ◽

Leaf Length ◽

Moderate Number ◽

Genome Wide ◽

Poplar Leaf

Leaf morphology exhibits tremendous diversity between and within species, and is likely related to adaptation to environmental factors. Most poplar species are of great economic and ecological value, and their leaf morphology can be a good predictor of wood productivity and environmental adaptation. It is important to understand the genetic mechanism behind variation in leaf shape. Although some initial efforts have been made to identify quantitative trait loci (QTLs) for poplar leaf traits, more effort is needed to unravel the polygenic architecture of the complex traits of leaf shape. Here, we performed a genome-wide association study (GWAS) of poplar leaf shape traits in a randomized complete block design with clones from F1 hybrids of Populus deltoides and Populus simonii. Based on a multivariate linear mixed model, a total of 35 SNPs were identified as significantly associated with the multiple traits formed by a moderate number of regular polar radii between the leaf centroid and its edge points, which together can represent the leaf shape. In contrast, applying the univariate linear mixed model to single leaf traits led to genomic inflation; thus, no significant SNPs were detected for leaf length, measures of leaf width, leaf area, or the ratio of leaf length to leaf width under genomic control. Investigation of the candidate genes showed that most flanking regions of the significant leaf shape-associated SNPs harbored genes that were related to leaf growth and development and to the regulation of leaf morphology. The combined use of the traditional experimental design and the multivariate linear mixed model could greatly improve the power in GWAS because the multiple trait data from a large number of individuals with replicates of clones were incorporated into the statistical model. The results of this study will enhance the understanding of the genetic mechanism of leaf shape variation in Populus. In addition, a moderate number of regular leaf polar radii can largely represent the leaf shape and can be used for GWAS of such a complicated trait in Populus, instead of the higher-dimensional regular radius data that were previously considered to well represent leaf shape.
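
The terms "genomic inflation" and "genomic control" in the abstract above refer to a standard correction that can be sketched briefly. The code below is a generic illustration, not the authors' pipeline: it assumes each marker's test statistic follows a chi-square distribution with 1 degree of freedom under the null, and the function names `inflation_factor` and `genomic_control` are hypothetical.

```python
from math import erfc, sqrt
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def inflation_factor(chi2_stats):
    """Genomic inflation factor: observed median statistic over expected median."""
    return median(chi2_stats) / CHI2_1DF_MEDIAN


def genomic_control(chi2_stats):
    """Deflate test statistics by the inflation factor and return p-values.

    The factor is clamped at 1 so that well-behaved or deflated scans are
    left unchanged. For 1 df, P(X > x) = erfc(sqrt(x / 2)).
    """
    lam = max(1.0, inflation_factor(chi2_stats))
    corrected = [s / lam for s in chi2_stats]
    p_values = [erfc(sqrt(s / 2.0)) for s in corrected]
    return lam, p_values


# Assumed example: a handful of inflated statistics from a hypothetical scan.
stats = [0.9, 1.4, 0.7, 2.1, 12.5, 0.8, 1.1]
lam, p_values = genomic_control(stats)
print("lambda:", round(lam, 2), "smallest corrected p:", min(p_values))
```

A factor well above 1 signals that most markers look more associated than chance allows, typically because of relatedness or population structure. Dividing every statistic by the factor restores the expected median but also shrinks genuine signals, which is consistent with the abstract's report that no single-trait SNP remained significant after correction.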
