Comparing GWAS Results of Complex Traits Using Full Genetic Model and Additive Models for Revealing Genetic Architecture

Md. Mamun Monir; Jun Zhu

doi:10.1038/srep38600

PSVIII-26 Gene editing of complex traits

Journal of Animal Science, 2019, Vol 97 (Supplement_3), pp. 259-260. doi:10.1093/jas/skz258.529

Author(s): Ashley Ling; Romdhane Rekaya

Keyword(s): Genetic Engineering; Complex Traits; Genetic Architecture; Quantitative Traits; Gene Editing; Genetic Model; Variable Number; Production Parameters; Potential Gain; Better Than

Abstract Gene editing (GE) is a form of genetic engineering in which DNA is removed, inserted or replaced. For simple monogenic traits, the technology has been successfully implemented to create heritable modifications in animals and plants. The benefits of these niche applications are undeniable. For quantitative traits the benefits of GE are hard to quantify, mainly because these traits are not genetic enough (low to moderate heritability) and their genetic architecture is often complex. Because of its impact on the gene pool through the introduction of heritable modifications, the potential gain from GE must be evaluated within reasonable production parameters and in comparison with available tools used in animal selection. A simulation was performed to compare GE with genomic selection (GS) and QTN-assisted selection (QAS) under four experimental factors: 1) heritability (0.1 or 0.4); 2) number of QTN affecting the trait (1000 or 10000) and their effect distribution (Gamma or uniform); 3) percentage of selected females (100% or 33%); and 4) fixed or variable number of edited QTNs. Three models, GS (M1), GS and GE (M2), and GS and QAS (M3), were implemented and compared. When the QTN effects were sampled from a Gamma distribution, all females were selected, and non-segregating QTNs were replaced, M2 clearly outperformed M1 and M3, with superiority ranging from 19 to 61%. Under the same scenario, M3 was 7 to 23% superior to M1. When the complexity of the genetic model increased (10000 QTN; uniform distribution), only one third of the females were selected, and the number of edited QTNs was fixed, the superiority of M2 was significantly reduced. In fact, M2 was only slightly better than M3 (2 to 6%). In all cases, M2 and M3 were better than M1. These results indicate that under realistic scenarios, GE for complex traits might have only limited advantages.

Phantom Epistasis in Genomic Selection: On the Predictive Ability of Epistatic Models

G3 Genes|Genome|Genetics, 2020, Vol 10 (9), pp. 3137-3145. doi:10.1534/g3.120.401300. Cited By ~ 3

Author(s): Matías F Schrauf; Johannes W R Martini; Henner Simianer; Gustavo de los Campos; Rodolfo Cantet; ...

Keyword(s): Genomic Selection; Complex Traits; Genetic Architecture; Association Studies; Predictive Ability; Additive Models; Marker Density; Interaction Terms; Biological Interpretation; Genetic Value

Abstract Genomic selection uses whole-genome marker models to predict phenotypes or genetic values for complex traits. Some of these models fit interaction terms between markers, and are therefore called epistatic. The biological interpretation of the corresponding fitted effects is not straightforward and there is the threat of overinterpreting their functional meaning. Here we show that the predictive ability of epistatic models relative to additive models can change with the density of the marker panel. In more detail, we show that for publicly available Arabidopsis and rice datasets, an initial superiority of epistatic models over additive models, which can be observed at a lower marker density, vanishes when the number of markers increases. We relate these observations to earlier results reported in the context of association studies which showed that detecting statistical epistatic effects may not only be related to interactions in the underlying genetic architecture, but also to incomplete linkage disequilibrium at low marker density (“Phantom Epistasis”). Finally, we illustrate in a simulation study that due to phantom epistasis, epistatic models may also predict the genetic value of an underlying purely additive genetic architecture better than additive models, when the marker density is low. Our observations can encourage the use of genomic epistatic models with low density panels, and discourage their biological over-interpretation.

Sept8/SEPTIN8 involvement in cellular structure and kidney damage is identified by genetic mapping and a novel human tubule hypoxic model

Scientific Reports, 2021, Vol 11 (1). doi:10.1038/s41598-021-81550-8

Author(s): Gregory R. Keele; Jeremy W. Prokop; Hong He; Katie Holl; John Littrell; ...

Keyword(s): Complex Traits; Genetic Model; Association Studies; Model Systems; Linear Mixed Effect Model; Genome Wide Association Studies; Tubulointerstitial Injury; Heritable Variation; Mixed Effect

Abstract Chronic kidney disease (CKD), which can ultimately progress to kidney failure, is influenced by genetics and the environment. Genes identified in human genome wide association studies (GWAS) explain only a small proportion of the heritable variation and lack functional validation, indicating the need for additional model systems. Outbred heterogeneous stock (HS) rats have been used for genetic fine-mapping of complex traits, but have not previously been used for CKD traits. We performed GWAS for urinary protein excretion (UPE) and CKD related serum biochemistries in 245 male HS rats. Quantitative trait loci (QTL) were identified using a linear mixed effect model that tested for association with imputed genotypes. Candidate genes were identified using bioinformatics tools and targeted RNAseq followed by testing in a novel in vitro model of human tubule, hypoxia-induced damage. We identified two QTL for UPE and five for serum biochemistries. Protein modeling identified a missense variant within Septin 8 (Sept8) as a candidate for UPE. Sept8/SEPTIN8 expression increased in HS rats with elevated UPE and tubulointerstitial injury and in the in vitro hypoxia model. SEPTIN8 is detected within proximal tubule cells in human kidney samples and localizes with acetyl-alpha tubulin in the culture system. After hypoxia, SEPTIN8 staining becomes diffuse and appears to relocalize with actin. These data suggest a role of SEPTIN8 in cellular organization and structure in response to environmental stress. This study demonstrates that integration of a rat genetic model with an environmentally induced tubule damage system identifies Sept8/SEPTIN8 and informs novel aspects of the complex gene by environmental interactions contributing to CKD risk.

Genetics of complex traits: prediction of phenotype, identification of causal polymorphisms and genetic architecture

Proceedings of The Royal Society B Biological Sciences, 2016, Vol 283 (1835), pp. 20160569. doi:10.1098/rspb.2016.0569. Cited By ~ 52

Author(s): M. E. Goddard; K. E. Kemper; I. M. MacLeod; A. J. Chamberlain; B. J. Hayes

Keyword(s): Complex Traits; Genetic Architecture; Quantitative Traits; Association Studies; Genome Wide Association Studies; Nucleotide Polymorphisms; Crop Breeding; Single Nucleotide; Genome Wide; Phenotype Identification

Complex or quantitative traits are important in medicine, agriculture and evolution, yet, until recently, few of the polymorphisms that cause variation in these traits were known. Genome-wide association studies (GWAS), based on the ability to assay thousands of single nucleotide polymorphisms (SNPs), have revolutionized our understanding of the genetics of complex traits. We advocate the analysis of GWAS data by a statistical method that fits all SNP effects simultaneously, assuming that these effects are drawn from a prior distribution. We illustrate how this method can be used to predict future phenotypes, to map and identify the causal mutations, and to study the genetic architecture of complex traits. The genetic architecture of complex traits is even more complex than previously thought: in almost every trait studied there are thousands of polymorphisms that explain genetic variation. Methods of predicting future phenotypes, collectively known as genomic selection or genomic prediction, have been widely adopted in livestock and crop breeding, leading to increased rates of genetic improvement.
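The idea of fitting all SNP effects simultaneously under a prior can be sketched for the simplest case. This is a minimal illustration, not the method used in the paper: it assumes a single Gaussian prior on every SNP effect, which reduces to ridge regression (often called SNP-BLUP) with penalty `lam` equal to the ratio of residual variance to per-SNP effect variance. The function name `fit_all_snps`, the toy genotypes, and the effect sizes are all hypothetical.

```python
import random

def fit_all_snps(X, y, lam, n_iter=200):
    """Jointly estimate SNP effects by ridge regression (coordinate descent).

    X: list of rows of genotype codes (0/1/2); y: phenotypes;
    lam: ridge penalty, assumed here to equal sigma_e^2 / sigma_beta^2.
    """
    n, m = len(X), len(X[0])
    # Center the phenotype and each genotype column so no intercept is needed.
    y_mean = sum(y) / n
    col_mean = [sum(row[j] for row in X) / n for j in range(m)]
    Xc = [[row[j] - col_mean[j] for j in range(m)] for row in X]
    resid = [yi - y_mean for yi in y]
    beta = [0.0] * m
    xx = [sum(Xc[i][j] ** 2 for i in range(n)) for j in range(m)]
    for _ in range(n_iter):
        for j in range(m):
            # Fit SNP j to the residual with its own current contribution added back.
            rho = sum(Xc[i][j] * resid[i] for i in range(n)) + xx[j] * beta[j]
            new = rho / (xx[j] + lam)
            delta = new - beta[j]
            for i in range(n):
                resid[i] -= Xc[i][j] * delta
            beta[j] = new
    return beta

# Toy data (hypothetical): 60 individuals, 5 SNPs, 3 with non-zero effects.
random.seed(1)
n, m = 60, 5
true_beta = [0.5, 0.0, -0.3, 0.0, 0.2]
X = [[random.choice((0, 1, 2)) for _ in range(m)] for _ in range(n)]
y = [sum(x * b for x, b in zip(row, true_beta)) + random.gauss(0, 0.5) for row in X]

weak = fit_all_snps(X, y, lam=1.0)      # diffuse prior: little shrinkage
strong = fit_all_snps(X, y, lam=100.0)  # tight prior: strong shrinkage toward zero
```

A predicted phenotype for a new individual would then be the sum of its centered genotype codes times the fitted effects. Priors that mix several effect-size classes would replace the single `lam` with SNP-specific shrinkage, which this sketch does not attempt.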

RIL-StEp: epistasis analysis of rice recombinant inbred lines (RILs) reveals candidate interacting genes that control seed hull color and leaf chlorophyll content

G3 Genes|Genome|Genetics, 2021. doi:10.1093/g3journal/jkab130

Author(s): Toshiyuki Sakai; Akira Abe; Motoki Shimizu; Ryohei Terauchi

Keyword(s): Chlorophyll Content; Recombinant Inbred; Complex Traits; Genetic Architecture; Recombinant Inbred Lines; Inbred Lines; Gene Interactions; Leaf Chlorophyll Content; Leaf Chlorophyll; Seed Hull

Abstract Characterizing epistatic gene interactions is fundamental for understanding the genetic architecture of complex traits. However, due to the large number of potential gene combinations, detecting epistatic gene interactions is computationally demanding. A simple, easy-to-perform method for sensitive detection of epistasis is required. Due to their homozygous nature, use of recombinant inbred lines (RILs) excludes the dominance effect of alleles and interactions involving heterozygous genotypes, thereby allowing detection of epistasis in a simple and interpretable model. Here, we present an approach called RIL-StEp (recombinant inbred lines stepwise epistasis detection) to detect epistasis using single nucleotide polymorphisms in the genome. We applied the method to reveal epistasis affecting rice (Oryza sativa) seed hull color and leaf chlorophyll content and successfully identified pairs of genomic regions that presumably control these phenotypes. This method has the potential to improve our understanding of the genetic architecture of various traits of crops and other organisms.

The utility of a closed breeding colony of Peromyscus leucopus for dissecting complex traits

2021. doi:10.1101/2021.08.14.456359

Author(s): Anthony D Long; Alan Barbour; Phillip N Long; Vanessa J Cook; Arundhati Majumder

Keyword(s): Complex Traits; Genetic Model; Peromyscus Leucopus; False Positive Rate; Model System; Sequencing Data; Power Estimate; Genome Wide; Low Pass; Closed Colony

Although Peromyscus leucopus (deermouse) is not considered a genetic model system, its genus is well suited for addressing several questions of biological interest, including the genetic bases of longevity, behavior, physiology, adaptation, and its ability to serve as a disease vector. Here we explore a diversity outbred approach for dissecting complex traits in Peromyscus leucopus, a non-traditional genetic model system. We take advantage of a closed colony of deermice founded from 38 individuals between 1982 and 1985 and subsequently maintained for 35+ years (~40-60 generations). From 405 low-pass (~1X) short-read sequenced deermice we accurately imputed genotypes at 17,751,882 SNPs. Conditional on observed genotypes for a subset of 297 individuals, simulations were conducted in which a QTL contributes 5% to a complex trait under three different genetic models. The power of either a haplotype- or marker-based statistical test was estimated to be 15-25% to detect the hidden QTL. Although modest, this power estimate is consistent with that of DO/HS mice and rat experiments for an experiment with ~300 individuals. This limitation in QTL detection is mostly associated with the stringent significance threshold required to hold the genome-wide false positive rate low, as in all cases we observe considerable linkage signal at the location of simulated QTL, suggesting a larger panel would exhibit greater power. For the subset of cases where a QTL was detected, localization ability appeared very desirable at ~1-2Mb. We finally carried out a GWAS on a demonstration trait, bleeding time. No tests exceeded the threshold for genome-wide significance, but one of four suggestive regions co-localizes with Von Willebrand factor. Our work suggests that complex traits can be dissected in founders-unknown P. leucopus colony mice in much the same manner as founders-known DO/HS mice and rats, with genotypes obtained from low pass sequencing data. Our results further suggest that the DO/HS approach can be powerfully extended to any system in which a founders-unknown closed colony has been maintained for several dozen generations.
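The dependence of power on sample size and significance threshold described above can be illustrated with a textbook approximation. This is only a sketch and does not reproduce the preprint's genotype-conditional simulations: it assumes a single-marker, 1-degree-of-freedom test on unrelated individuals, with non-centrality parameter n·q²/(1 − q²) for a QTL explaining a fraction q² of trait variance. The function name `qtl_power` and the threshold values passed to it are hypothetical choices, not values taken from the study.

```python
from statistics import NormalDist

def qtl_power(n, var_explained, alpha):
    """Approximate power of a two-sided 1-df association test.

    n: number of individuals; var_explained: fraction of trait variance
    due to the QTL; alpha: per-test significance threshold (assumed).
    """
    std_normal = NormalDist()
    # Non-centrality parameter of the 1-df chi-square statistic.
    ncp = n * var_explained / (1.0 - var_explained)
    mu = ncp ** 0.5
    # Critical value on the z scale for a two-sided test at level alpha.
    z_crit = -std_normal.inv_cdf(alpha / 2.0)
    # Probability that |Z| exceeds z_crit when Z ~ Normal(mu, 1).
    return std_normal.cdf(mu - z_crit) + std_normal.cdf(-mu - z_crit)

# Illustrative comparisons with hypothetical thresholds and sample sizes.
power_small_panel = qtl_power(297, 0.05, 5e-8)
power_large_panel = qtl_power(1000, 0.05, 5e-8)
power_lenient = qtl_power(297, 0.05, 1e-4)
```

Under these assumptions, power rises both when the panel grows and when the threshold is relaxed, which is the qualitative pattern the abstract attributes to the stringent genome-wide threshold.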

Better estimation of SNP heritability from summary statistics provides a new understanding of the genetic architecture of complex traits

2018. doi:10.1101/284976. Cited By ~ 6

Author(s): Doug Speed; David J Balding

Keyword(s): Complex Traits; Genetic Architecture; Large Scale; Association Studies; Genome Wide Association Studies; Summary Statistics; Confounding Bias; Conserved Regions; Genome Wide; Variation Explained

LD Score Regression (LDSC) has been widely applied to the results of genome-wide association studies. However, its estimates of SNP heritability are derived from an unrealistic model in which each SNP is expected to contribute equal heritability. As a consequence, LDSC tends to over-estimate confounding bias, under-estimate the total phenotypic variation explained by SNPs, and provide misleading estimates of the heritability enrichment of SNP categories. Therefore, we present SumHer, software for estimating SNP heritability from summary statistics using more realistic heritability models. After demonstrating its superiority over LDSC, we apply SumHer to the results of 24 large-scale association studies (average sample size 121 000). First we show that these studies have tended to substantially over-correct for confounding, and as a result the number of genome-wide significant loci has been under-reported by about 20%. Next we estimate enrichment for 24 categories of SNPs defined by functional annotations. A previous study using LDSC reported that conserved regions were 13-fold enriched, and found a further twelve categories with above 2-fold enrichment. By contrast, our analysis using SumHer finds that conserved regions are only 1.6-fold (SD 0.06) enriched, and that no category has enrichment above 1.7-fold. SumHer provides an improved understanding of the genetic architecture of complex traits, which enables more efficient analysis of future genetic data.
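For context, the equal-contribution assumption criticized above appears in the standard LDSC regression equation, written here as it is usually stated in the LDSC literature rather than taken from this abstract. The symbols are assumptions of this sketch: N is the sample size, M the number of SNPs, h² the SNP heritability, a the confounding term, and ℓ_j the LD score of SNP j.

```latex
\mathbb{E}\left[\chi^2_j \mid \ell_j\right] = \frac{N h^2}{M}\,\ell_j + N a + 1,
\qquad \ell_j = \sum_{k} r^2_{jk}
```

The factor h²/M is where every SNP is assigned the same expected heritability. A model in which expected per-SNP heritability varies with SNP properties would replace that constant with SNP-specific weights.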

On Using Local Ancestry to Characterize the Genetic Architecture of Human Phenotypes: Genetic Regulation of Gene Expression in Multiethnic or Admixed Populations as a Model

2018. doi:10.1101/483107. Cited By ~ 1

Author(s): Yizhen Zhong; Minoli Perera; Eric R. Gamazon

Keyword(s): Gene Expression; Complex Traits; Genetic Architecture; Genetic Regulation; Regulation Of Gene Expression; Type I; Eqtl Mapping; Entire Genome; Local Ancestry; Heritability Estimation

Abstract Background: Understanding the nature of the genetic regulation of gene expression promises to advance our understanding of the genetic basis of disease. However, the methodological impact of use of local ancestry on high-dimensional omics analyses, including most prominently expression quantitative trait loci (eQTL) mapping and trait heritability estimation, in admixed populations remains critically underexplored. Results: Here we develop a statistical framework that characterizes the relationships among the determinants of the genetic architecture of an important class of molecular traits. We estimate the trait variance explained by ancestry using local admixture relatedness between individuals. Using National Institute of General Medical Sciences (NIGMS) and Genotype-Tissue Expression (GTEx) datasets, we show that use of local ancestry can substantially improve eQTL mapping and heritability estimation and characterize the sparse versus polygenic component of gene expression in admixed and multiethnic populations respectively. Using simulations of diverse genetic architectures to estimate trait heritability and the level of confounding, we show improved accuracy given individual-level data and evaluate a summary statistics based approach. Furthermore, we provide a computationally efficient approach to local ancestry analysis in eQTL mapping while increasing control of type I and type II error over traditional approaches. Conclusion: Our study has important methodological implications on genetic analysis of omics traits across a range of genomic contexts, from a single variant to a prioritized region to the entire genome. Our findings highlight the importance of using local ancestry to better characterize the heritability of complex traits and to more accurately map genetic associations.

TIGAR: An Improved Bayesian Tool for Transcriptomic Data Imputation Enhances Gene Mapping of Complex Traits

2018. doi:10.1101/507525. Cited By ~ 3

Author(s): Sini Nagpal; Xiaoran Meng; Michael P. Epstein; Lam C. Tsoi; Matthew Patrick; ...

Keyword(s): Gene Expression; Complex Traits; Bayesian Model; Genetic Architecture; Bayesian Method; Association Studies; Gwas Data; Nonparametric Bayesian; Transcriptomic Data; Special Cases

Abstract The transcriptome-wide association studies (TWAS) that test for association between the study trait and the imputed gene expression levels from cis-acting expression quantitative trait loci (cis-eQTL) genotypes have successfully enhanced the discovery of genetic risk loci for complex traits. By using the gene expression imputation models fitted from reference datasets that have both genetic and transcriptomic data, TWAS facilitates gene-based tests with GWAS data while accounting for the reference transcriptomic data. The existing TWAS tools like PrediXcan and FUSION use parametric imputation models that have limitations for modeling the complex genetic architecture of transcriptomic data. Therefore, we propose an improved Bayesian method that assumes a data-driven nonparametric prior to impute gene expression. Our method is general and flexible and includes both the parametric imputation models used by PrediXcan and FUSION as special cases. Our simulation studies showed that the nonparametric Bayesian model improved both imputation R² for transcriptomic data and the TWAS power over PrediXcan. In real applications, our nonparametric Bayesian method fitted transcriptomic imputation models for 2X number of genes with 1.7X average regression R² over PrediXcan, thus improving the power of follow-up TWAS. Hence, the nonparametric Bayesian model is preferred for modeling the complex genetic architecture of transcriptomes and is expected to enhance transcriptome-integrated genetic association studies. We implement our Bayesian approach in a convenient software tool “TIGAR” (Transcriptome-Integrated Genetic Association Resource), which imputes transcriptomic data and performs subsequent TWAS using individual-level or summary-level GWAS data.

Bayesian Model Choice and Search Strategies for Mapping Interacting Quantitative Trait Loci

Genetics, 2003, Vol 165 (2), pp. 867-883. doi:10.1093/genetics/165.2.867. Cited By ~ 19

Author(s): Nengjun Yi; Shizhong Xu; David B Allison

Keyword(s): Quantitative Trait Loci; Quantitative Trait; Complex Traits; Bayesian Model; Genetic Model; Genetic Effects; Monte Carlo Algorithm; Data Set; Epistatic Effects; Trait Loci

Abstract Most complex traits of animals, plants, and humans are influenced by multiple genetic and environmental factors. Interactions among multiple genes play fundamental roles in the genetic control and evolution of complex traits. Statistical modeling of interaction effects in quantitative trait loci (QTL) analysis must accommodate a very large number of potential genetic effects, which presents a major challenge to determining the genetic model with respect to the number of QTL, their positions, and their genetic effects. In this study, we use the methodology of Bayesian model and variable selection to develop strategies for identifying multiple QTL with complex epistatic patterns in experimental designs with two segregating genotypes. Specifically, we develop a reversible jump Markov chain Monte Carlo algorithm to determine the number of QTL and to select main and epistatic effects. With the proposed method, we can jointly infer the genetic model of a complex trait and the associated genetic parameters, including the number, positions, and main and epistatic effects of the identified QTL. Our method can map a large number of QTL with any combination of main and epistatic effects. Utility and flexibility of the method are demonstrated using both simulated data and a real data set. Sensitivity of posterior inference to prior specifications of the number and genetic effects of QTL is investigated.
