A model of compound heterozygous, loss-of-function alleles is broadly consistent with observations from complex-disease GWAS datasets

Mapping Intimacies ◽

10.1101/048819 ◽

2016 ◽

Cited By ~ 2

Author(s):

Jaleal S. Sanjak ◽

Anthony D. Long ◽

Kevin R. Thornton

Keyword(s):

Genetic Variation ◽

Allele Frequency ◽

Population Genetic ◽

Complex Disease ◽

Genetic Model ◽

Disease Risk ◽

Association Studies ◽

Allele Frequency Distribution ◽

Genome Wide Association Studies ◽

Heritability Estimation

Abstract: The genetic component of complex disease risk in humans remains largely unexplained. A corollary is that the allelic spectrum of genetic variants contributing to complex disease risk is unknown. Theoretical models that relate population genetic processes to the maintenance of genetic variation for quantitative traits may suggest profitable avenues for future experimental design. Here we use forward simulation to model a genomic region evolving under a balance between recurrent deleterious mutation and Gaussian stabilizing selection. We consider multiple genetic and demographic models, and several different methods for identifying genomic regions harboring variants associated with complex disease risk. We demonstrate that the model of gene action, relating genotype to phenotype, has a qualitative effect on several relevant aspects of the population genetic architecture of a complex trait. In particular, the genetic model impacts genetic variance component partitioning across the allele frequency spectrum and the power of statistical tests. Models with partial recessivity closely match the minor allele frequency distribution of significant hits from empirical genome-wide association studies without requiring homozygous effect-sizes to be small. We highlight a particular gene-based model of incomplete recessivity that is appealing from first principles. Under that model, deleterious mutations in a genomic region partially fail to complement one another. This model of gene-based recessivity predicts the empirically observed inconsistency between twin- and SNP-based estimates of dominance heritability. Furthermore, this model predicts considerable levels of unexplained variance associated with intralocus epistasis. Our results suggest a need for improved statistical tools for region-based genetic association and heritability estimation.

Author Summary: Gene action determines how mutations affect phenotype. When placed in an evolutionary context, the details of the genotype-to-phenotype model can impact the maintenance of genetic variation for complex traits. Likewise, non-equilibrium demographic history may affect patterns of genetic variation. Here, we explore the impact of genetic model and population growth on the distribution of genetic variance across the allele frequency spectrum underlying risk for a complex disease. Using forward-in-time population genetic simulations, we show that the genetic model has important impacts on the composition of variation for complex disease risk in a population. We explicitly simulate genome-wide association studies (GWAS) and perform heritability estimation on population samples. A particular model of gene-based partial recessivity, based on allelic non-complementation, aligns well with empirical results. This model is congruent with the dominance variance estimates from both SNPs and twins, and the minor allele frequency distribution of GWAS hits.
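To make the contrast between an additive model and a gene-based recessive model concrete, the following is a minimal sketch, not the authors' simulation code. It assumes that each haplotype carries non-negative deleterious effect sizes, that the gene-based recessive phenotype can be encoded as the geometric mean of the two haplotype sums (an assumption chosen here for illustration), and that fitness is a Gaussian function of the distance from an optimum. All function and parameter names are hypothetical.

```python
import math


def gaussian_fitness(phenotype, optimum=0.0, sigma_s=1.0):
    """Gaussian stabilizing selection: fitness decays with squared distance from the optimum."""
    return math.exp(-((phenotype - optimum) ** 2) / (2.0 * sigma_s ** 2))


def additive_phenotype(hap1_effects, hap2_effects):
    """Additive model: every mutation contributes regardless of which haplotype carries it."""
    return sum(hap1_effects) + sum(hap2_effects)


def gene_based_recessive_phenotype(hap1_effects, hap2_effects):
    """Illustrative gene-based recessive model (assumed form): geometric mean of haplotype sums.

    The phenotype is zero unless BOTH haplotypes carry at least one mutation,
    so different mutations on opposite haplotypes fail to complement.
    Effect sizes are assumed to be non-negative.
    """
    return math.sqrt(sum(hap1_effects) * sum(hap2_effects))


# A carrier of a single mutation is unaffected under the gene-based model...
single_carrier = gene_based_recessive_phenotype([0.4], [])
# ...while a compound heterozygote (different mutations on each haplotype) is affected.
compound_het = gene_based_recessive_phenotype([0.4], [0.9])
```

Under these assumptions a compound heterozygote has a non-zero phenotype even though neither mutation is homozygous, which is the non-complementation behavior the abstract describes.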

Analysis of chromatin organization and gene expression in T cells identifies functional genes for rheumatoid arthritis

10.1101/827923 ◽

2019 ◽

Author(s):

Jing Yang ◽

Amanda McGovern ◽

Paul Martin ◽

Kate Duffus ◽

Xiangyu Ge ◽

...

Keyword(s):

Rheumatoid Arthritis ◽

Gene Expression ◽

T Cells ◽

Complex Disease ◽

Target Genes ◽

Disease Risk ◽

Association Studies ◽

Dna Interaction ◽

Genome Wide Association Studies ◽

Causal Genes

Abstract: Genome-wide association studies have identified genetic variation contributing to complex disease risk. However, assigning causal genes and mechanisms has been more challenging because disease-associated variants are often found in distal regulatory regions with cell-type specific behaviours. Here, we collect ATAC-seq, Hi-C, Capture Hi-C and nuclear RNA-seq data in stimulated CD4+ T-cells over 24 hours, to identify functional enhancers regulating gene expression. We characterise changes in DNA interaction and activity dynamics that correlate with changes in gene expression, and find that the strongest correlations are observed within 200 kb of promoters. Using rheumatoid arthritis as an example of T-cell-mediated disease, we demonstrate interactions of expression quantitative trait loci with target genes, and confirm assigned genes or show complex interactions for 20% of disease-associated loci, including FOXO1, which we confirm using CRISPR/Cas9.

Flaw or discovery? Calculating exact p-values for genome-wide association studies in inbred populations

10.1101/015339 ◽

2015 ◽

Author(s):

Xia Shen

Keyword(s):

Allele Frequency ◽

Frequency Distribution ◽

Minor Allele Frequency ◽

Association Studies ◽

Genome Wide Association ◽

Allele Frequency Distribution ◽

Genome Wide Association Studies ◽

P Values ◽

Genome Wide ◽

Inbred Populations

Motivation: Genome-wide association studies have been conducted in inbred populations where the sample size is small. The ordinary association p-values and multiple testing correction therefore become questionable, as the detected genetic effect may or may not be due to chance, depending on the minor allele frequency distribution across the genome. Instead of permutation testing, the marker-specific false positive rate can be analytically calculated in inbred populations without heterozygotes. Results: Solutions of exact p-values for genome-wide association studies in inbred populations were derived and implemented. An example is presented to illustrate that the marker-specific experiment-wise p-value varies as the genome-wide minor allele frequency distribution changes. A simulation using a real Arabidopsis thaliana genome indicates that the use of exact p-values improves detection power and reduces inflation due to population structure. An analysis of a defense-related case-control phenotype using the exact p-values revealed the causal locus, where markers with higher MAFs had smaller p-values than the top variants with lower MAFs in ordinary genome-wide association analysis. Availability and Implementation: Project URL: https://r-forge.r-project.org/projects/statomics/. The R package p.exact: https://r-forge.r-project.org/R/?group_id=2030.

In Search of Complex Disease Risk through Genome Wide Association Studies

Mathematics ◽

10.3390/math9233083 ◽

2021 ◽

Vol 9 (23) ◽

pp. 3083

Author(s):

Lorena Alonso ◽

Ignasi Morán ◽

Cecilia Salvoro ◽

David Torrents

Keyword(s):

Genetic Variants ◽

Complex Disease ◽

Disease Risk ◽

Association Studies ◽

Personalised Medicine ◽

Genome Wide Association ◽

Phenotypic Traits ◽

Genome Wide Association Studies ◽

Treatment Protocols ◽

Genome Wide

The identification and characterisation of genomic changes (variants) that can lead to human diseases is one of the central aims of biomedical research. The generation of catalogues of genetic variants that have an impact on specific diseases is the basis of Personalised Medicine, where diagnoses and treatment protocols are selected according to each patient’s profile. In this context, the study of complex diseases, such as Type 2 diabetes or cardiovascular alterations, is fundamental. However, these diseases result from the combination of multiple genetic and environmental factors, which makes the discovery of causal variants particularly challenging at a statistical and computational level. Genome-Wide Association Studies (GWAS), which are based on the statistical analysis of genetic variant frequencies across non-diseased and diseased individuals, have been successful in finding genetic variants that are associated with specific diseases or phenotypic traits. However, GWAS methodology is limited when considering important genetic aspects of the disease and has not yet resulted in meaningful translation to clinical practice. This review presents an outlook on the study of the link between genetics and complex phenotypes. We first present an overview of the past and current statistical methods used in the field. Next, we discuss current practices and their main limitations. Finally, we describe the open challenges that remain and that might benefit greatly from further mathematical developments.

Genetic Epidemiology in Latin America: Identifying Strong Genetic Proxies for Complex Disease Risk Factors

Genes ◽

10.3390/genes11050507 ◽

2020 ◽

Vol 11 (5) ◽

pp. 507

Author(s):

Carolina Bonilla ◽

Lara Novaes Baccarini

Keyword(s):

Latin America ◽

Genetic Variants ◽

Complex Disease ◽

Disease Risk ◽

Association Studies ◽

Strong Association ◽

Genome Wide Association Studies ◽

Reverse Causation ◽

Health And Wellbeing ◽

Genome Wide

Epidemiology seeks to determine the causal effects of exposures on outcomes related to the health and wellbeing of populations. Observational studies, one of the most commonly used designs in epidemiology, can be biased due to confounding and reverse causation, which makes it difficult to establish causal relationships. In recent times, genetically informed methods, like Mendelian randomization (MR), have been developed in an attempt to overcome these disadvantages. MR relies on the association of genetic variants with outcomes of interest, where the genetic variants are proxies or instruments for modifiable exposures. Because genotypes are sorted independently and at random at the time of conception, they are less prone to confounding and reverse causation. Implementation of MR depends on, among other things, a strong association of the genetic variants with the exposure, which has usually been defined via genome-wide association studies (GWAS). Because GWAS have been most often carried out in European populations, the limited identification of strong instruments in other populations poses a major problem for the application of MR in Latin America. We suggest potential solutions that can be realized with the resources at hand and others that will have to wait for increased funding and access to technology.
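The role of instrument strength described above can be sketched with the standard single-instrument Wald ratio. This is a generic illustration under stated assumptions, not an analysis from the article: the SNP-exposure and SNP-outcome effect estimates are taken as given summary statistics, the F-statistic is approximated as the squared ratio of the SNP-exposure estimate to its standard error, and the threshold of 10 is the usual rule of thumb. Names and numbers are hypothetical.

```python
def wald_ratio(beta_snp_outcome, beta_snp_exposure):
    """Single-instrument MR estimate of the effect of the exposure on the outcome."""
    if beta_snp_exposure == 0:
        raise ValueError("instrument has no association with the exposure")
    return beta_snp_outcome / beta_snp_exposure


def instrument_f_statistic(beta_snp_exposure, se_snp_exposure):
    """Approximate F-statistic for one SNP: squared z-score of the SNP-exposure association."""
    return (beta_snp_exposure / se_snp_exposure) ** 2


def is_strong_instrument(beta_snp_exposure, se_snp_exposure, threshold=10.0):
    """Rule-of-thumb check; the threshold of 10 is a convention, not a guarantee."""
    return instrument_f_statistic(beta_snp_exposure, se_snp_exposure) > threshold


# Hypothetical summary statistics for one variant.
causal_estimate = wald_ratio(beta_snp_outcome=0.1, beta_snp_exposure=0.5)
strong = is_strong_instrument(beta_snp_exposure=0.5, se_snp_exposure=0.1)
```

A variant whose exposure association was estimated in a European GWAS may have a smaller effect, or a larger standard error, in another population, which lowers the F-statistic and weakens the instrument.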

Harnessing the information contained within genome-wide association studies to improve individual prediction of complex disease risk

Human Molecular Genetics ◽

10.1093/hmg/ddp295 ◽

2009 ◽

Vol 18 (18) ◽

pp. 3525-3531 ◽

Cited By ~ 214

Author(s):

David M. Evans ◽

Peter M. Visscher ◽

Naomi R. Wray

Keyword(s):

Complex Disease ◽

Disease Risk ◽

Association Studies ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Genome Wide

Applying the Complexity of Networks to Mine Disease Risk Genes

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.556-562.5958 ◽

2014 ◽

Vol 556-562 ◽

pp. 5958-5963

Author(s):

Hua Lin ◽

Xia Hong ◽

Zhou Ping

Keyword(s):

Biological Networks ◽

Complex Disease ◽

Genetic Factors ◽

Disease Risk ◽

Association Studies ◽

Significant Risk ◽

Genome Wide Association Studies ◽

Genome Database ◽

Risk Genes ◽

Ppi Networks

Rheumatoid arthritis (RA) is a complex disease determined by multilocus genetic factors. Although genome-wide association studies have proven to be a powerful approach for identifying risk loci, the molecular regulatory mechanisms of RA are still not clearly understood. It is therefore important to consider the interplay between genetic factors and biological networks in elucidating the mechanisms of RA pathogenesis. Here, we applied the complexity of the Protein-Protein Interaction (PPI) network to identify disease risk genes. First, we assigned risk SNPs to genes from the UCSC genome database and mapped these genes to PPI networks. With the aid of the PPI networks, gene modules were extracted and risk feature genes were identified. As a result, risk feature genes, such as CD40 and PKCA, were identified as significant risk gene sets associated with RA.

Estimation of linkage disequilibrium levels and allele frequency distribution in crossbred Vrindavani cattle using 50K SNP data

PLoS ONE ◽

10.1371/journal.pone.0259572 ◽

2021 ◽

Vol 16 (11) ◽

pp. e0259572

Author(s):

Akansha Singh ◽

Amit Kumar ◽

Arnav Mehrotra ◽

Karthikeyan A. ◽

Ashwni Kumar Pandey ◽

...

Keyword(s):

Linkage Disequilibrium ◽

Allele Frequency ◽

Minor Allele Frequency ◽

Association Studies ◽

Minor Allele ◽

Pairwise Distance ◽

Allele Frequency Distribution ◽

Genome Wide Association Studies ◽

Effective Population ◽

Autosomal Snps

The objective of this study was to calculate the extent and decay of linkage disequilibrium (LD) in 96 crossbred Vrindavani cattle genotyped with the Bovine SNP50K Bead Chip. After filtering, 43,821 SNPs were retained for final analysis, across 2500.3 Mb of the autosomes. A significant percentage of SNPs had a minor allele frequency of less than 0.20. The extent of LD between autosomal SNPs up to 10 Mb apart across the genome was measured using the r2 statistic. The mean r2 value was 0.43 when the pairwise distance between markers was less than 10 kb, and it decreased to 0.21 for a marker distance of 25–50 kb. Further, the effect of minor allele frequency and sample size on the LD estimate was investigated. The LD value decreased as inter-marker distance increased, and increased as minor allele frequency increased. The estimated inbreeding coefficient and effective population size for the present generation were 0.04 and 46, respectively, which indicated a small and unstable population of Vrindavani cattle. These findings suggested that a denser or breed-specific SNP panel would be required to cover the whole genome of Vrindavani cattle for genome wide association studies (GWAS).
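The r2 statistic mentioned above can be sketched for a single pair of biallelic SNPs. This is a minimal illustration, not the pipeline used in the study: it assumes phased haplotypes coded 0/1 at each locus (array genotype data would first need phasing or a genotype-correlation approximation), and the function name is hypothetical.

```python
def ld_r2(haplotypes):
    """Squared correlation (r2) between two biallelic loci.

    haplotypes: list of (allele_at_locus_a, allele_at_locus_b) pairs, each coded 0 or 1.
    Uses r2 = D^2 / (pA * (1 - pA) * pB * (1 - pB)), where D = pAB - pA * pB.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("no haplotypes supplied")
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r2 is undefined when a locus is monomorphic")
    d = p_ab - p_a * p_b
    return d * d / denom


# Two loci whose alleles always travel together versus two loci that assort independently.
perfect_ld = ld_r2([(1, 1), (1, 1), (0, 0), (0, 0)])
no_ld = ld_r2([(1, 1), (1, 0), (0, 1), (0, 0)])
```

Averaging this quantity over all SNP pairs within a distance bin gives the kind of decay curve the abstract summarizes.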

Functional non-coding SNPs in human endothelial cells fine-map vascular trait associations

10.1101/2021.08.03.454513 ◽

2021 ◽

Author(s):

Anu Toropainen ◽

Lindsey K Stolze ◽

Tiit Ord ◽

Michael B Whalen ◽

Paula Marta Torrell ◽

...

Keyword(s):

Endothelial Cells ◽

Vascular Disease ◽

Complex Disease ◽

Disease Risk ◽

Association Studies ◽

Chromatin Accessibility ◽

Genome Wide Association Studies ◽

Large Artery ◽

Human Endothelial Cells ◽

Common Complex Disease

Functional consequences of genetic variation in the non-coding human genome are difficult to ascertain despite demonstrated associations with common, complex disease traits. To elucidate properties of functional non-coding SNPs with effects in human endothelial cells (EC), we utilized molecular Quantitative Trait Locus (molQTL) analysis for transcription factor binding, chromatin accessibility, and H3K27 acetylation to nominate a set of likely functional non-coding SNPs. Together with information from genome-wide association studies for vascular disease traits, we tested the ability of 34,344 variants to perturb enhancer function in ECs using the highly multiplexed STARR-seq assay. Of these, 5,592 variants validated, whose enriched attributes included: 1) mutations to TF binding motifs for ETS or AP1 that are regulators of EC state, 2) location in accessible and H3K27ac-marked EC chromatin, and 3) molQTL associations whereby alleles associate with differences in chromatin accessibility and TF binding across genetically diverse ECs. Next, using pro-inflammatory IL1B as an activator of cell state, we observed robust evidence (>50%) of context-specific SNP effects, underscoring the prevalence of non-coding gene-by-environment (GxE) effects. Lastly, using these cumulative data, we fine-mapped vascular disease loci and highlight evidence suggesting mechanisms by which non-coding SNPs at two loci affect risk for Pulse Pressure/Large Artery Stroke, and Abdominal Aortic Aneurysm through respective effects on transcriptional regulation of POU4F1 and LDAH. Together, we highlight the attributes and context dependence of functional non-coding SNPs, and provide new mechanisms underlying vascular disease risk.

A Genetic Map of the Modern Urban Society of Amsterdam

Frontiers in Genetics ◽

10.3389/fgene.2021.727269 ◽

2021 ◽

Vol 12 ◽

Author(s):

Bart Ferwerda ◽

Abdel Abdellaoui ◽

Max Nieuwdorp ◽

Koos Zwinderman

Keyword(s):

Genetic Variation ◽

Complex Traits ◽

Disease Risk ◽

Association Studies ◽

Urban Setting ◽

European Ancestry ◽

Joint Analysis ◽

Genome Wide Association Studies ◽

Comprehensive Overview ◽

Polygenic Scores

Genetic differences between individuals underlie susceptibility to many diseases. Genome-wide association studies (GWAS) have discovered many susceptibility genes but were often limited to cohorts of predominantly European ancestry. Genetic diversity between individuals due to different ancestries and evolutionary histories shows that this approach has limitations. In order to gain a better understanding of the associated genetic variation, we need a more global genomics approach including a greater diversity. Here, we introduce the Healthy Life in an Urban Setting (HELIUS) cohort. The HELIUS cohort consists of participants living in Amsterdam, with a level of diversity that reflects the Dutch colonial and recent migration past. The current study includes 10,283 participants with genetic data available from seven groups of inhabitants, namely, Dutch, African Surinamese, South-Asian Surinamese, Turkish, Moroccan, Ghanaian, and Javanese Surinamese. First, we describe the genetic variation and admixture within the HELIUS cohort. Second, we show the challenges during imputation when having a genetically diverse cohort. Third, we conduct a body mass index (BMI) and height GWAS where we investigate the effects of a joint analysis of the entire cohort and a meta-analysis approach for the different subgroups. Finally, we construct polygenic scores for BMI and height and compare their predictive power across the different ethnic groups. Overall, we give a comprehensive overview of a genetically diverse cohort from Amsterdam. Our study emphasizes the importance of a less biased and more realistic representation of urban populations for mapping genetic associations with complex traits and disease risk for all.
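A polygenic score of the kind mentioned above is, in its simplest form, a weighted sum of allele dosages. The sketch below assumes per-variant weights taken from some GWAS and dosages coded as 0, 1, or 2 copies of the effect allele; it does not reproduce the HELIUS analysis, and the names and values are hypothetical.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of effect-allele dosages for one individual.

    dosages: effect-allele counts (0, 1, or 2, or imputed fractional values) per variant.
    weights: per-variant effect sizes, in the same variant order as dosages.
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must cover the same variants")
    return sum(d * w for d, w in zip(dosages, weights))


# Hypothetical three-variant score for one participant.
score = polygenic_score(dosages=[0, 1, 2], weights=[0.5, -0.2, 0.1])
```

Because the weights are usually estimated in cohorts of one ancestry, the same formula can predict less well in other groups, which is the comparison the abstract describes.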

Genomics of Endometriosis: From Genome Wide Association Studies to Exome Sequencing

International Journal of Molecular Sciences ◽

10.3390/ijms22147297 ◽

2021 ◽

Vol 22 (14) ◽

pp. 7297

Author(s):

Imane Lalami ◽

Carole Abo ◽

Bruno Borghese ◽

Charles Chapron ◽

Daniel Vaiman

Keyword(s):

Exome Sequencing ◽

Complex Disease ◽

Association Studies ◽

Chromosome 9 ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Replication Studies ◽

Genome Wide ◽

Heritability Estimation ◽

Functional Explanations

This review aims to improve understanding of the genetics of endometriosis. Endometriosis is a frequent disease in women, affecting up to 10% of them, and is characterized by pain and infertility. In the most accepted hypothesis, endometriosis is caused by the implantation of uterine tissue at ectopic abdominal sites, originating from retrograde menses. Despite the obvious genetic complexity of the disease, analysis of siblings has allowed the heritability of endometriosis to be estimated at ~50%. Since 2010, large Genome Wide Association Studies (GWAS) have aimed at identifying the genes and loci underlying this genetic determinism. Some of these loci were confirmed in other populations and replication studies, and some new loci were found through meta-analyses using pooled samples. For two loci, on chromosome 1 (near CDC42) and chromosome 9 (near CDKN2A), functional explanations of the SNP (Single Nucleotide Polymorphism) effects have been more thoroughly studied. While a handful of chromosome regions and genes have clearly been identified and statistically demonstrated as at-risk for the disease, only a small part of the heritability is explained (missing heritability). Some exome sequencing efforts have started to identify additional genes from families or populations, but these are still scarce. The solution may lie in a combined effort: increasing the size of GWAS designs, better categorizing the clinical forms of the disease before analyzing genome-wide polymorphisms, and generalizing exome sequencing ventures. We try here to provide a vision of what we have and what we should obtain to completely elucidate the genetics of this complex disease.
