A high-quality reference panel reveals the complexity and distribution of structural genome changes in a human population

Mapping Intimacies ◽

10.1101/036897 ◽

2016 ◽

Cited By ~ 2

Author(s):

Jayne Y. Hehir-Kwa ◽

Tobias Marschall ◽

Wigard P. Kloosterman ◽

Laurent C. Francioli ◽

Jasmijn A. Baaijens ◽

...

Keyword(s):

Association Studies ◽

Whole Genome Sequencing Data ◽

Strong Linkage Disequilibrium ◽

Genome Wide Association Studies ◽

Sequencing Data ◽

Full Spectrum ◽

Disease Phenotypes ◽

Human Genomes ◽

Genome Wide ◽

Genome Variants

Abstract. Structural variation (SV) represents a major source of differences between individual human genomes and has been linked to disease phenotypes. However, the majority of studies neither provide a global view of the full spectrum of these variants nor integrate them into reference panels of genetic variation. Here, we analyse whole genome sequencing data of 769 individuals from 250 Dutch families, and provide a haplotype-resolved map of 1.9 million genome variants across 9 different variant classes, including novel forms of complex indels, and retrotransposition-mediated insertions of mobile elements and processed RNAs. A large proportion are previously underreported variants sized between 21 and 100 bp. We detect 4 megabases of novel sequence, encoding 11 new transcripts. Finally, we show 191 known, trait-associated SNPs to be in strong linkage disequilibrium with SVs and demonstrate that our panel facilitates accurate imputation of SVs in unrelated individuals. Our findings are essential for genome-wide association studies.


Quantifying the mapping precision of genome-wide association studies using whole-genome sequencing data

Genome Biology ◽

10.1186/s13059-017-1216-0 ◽

2017 ◽

Vol 18 (1) ◽

Cited By ~ 46

Author(s):

Yang Wu ◽

Zhili Zheng ◽

Peter M. Visscher ◽

Jian Yang

Keyword(s):

Whole Genome Sequencing ◽

Genome Sequencing ◽

Association Studies ◽

Genome Wide Association ◽

Whole Genome Sequencing Data ◽

Genome Wide Association Studies ◽

Whole Genome ◽

Sequencing Data ◽

Genome Wide


Genome-wide discovery of epistatic loci affecting antibiotic resistance using evolutionary couplings

10.1101/325993 ◽

2018 ◽

Author(s):

Benjamin Schubert ◽

Rohan Maddamsetti ◽

Jackson Nyman ◽

Maha R. Farhat ◽

Debora S. Marks

Keyword(s):

Antibiotic Resistance ◽

Statistical Power ◽

Genome Wide Association Study ◽

Association Studies ◽

Genome Wide Association ◽

Whole Genome Sequencing Data ◽

Genome Wide Association Studies ◽

Sequencing Data ◽

Vast Number ◽

Genome Wide

Abstract. The analysis of whole genome sequencing data should, in theory, allow the discovery of interdependent loci that cause antibiotic resistance. In practice, however, identifying this epistasis remains a challenge as the vast number of possible interactions erodes statistical power. To solve this problem, we extend a method that has been successfully used to identify epistatic residues in proteins to infer genomic loci that are strongly coupled and associated with antibiotic resistance. Our method reduces the number of tests required for an epistatic genome-wide association study and increases the likelihood of identifying causal epistasis. We discovered 38 loci and 250 epistatic pairs that influence the dose needed to inhibit growth for five different antibiotics in 1,102 isolates of Neisseria gonorrhoeae; these were confirmed in an independent dataset of 495 isolates. Many known resistance-affecting loci were recovered; however, the majority of loci occurred in unreported genes, including murE, which was associated with cefixime. About half of the novel epistasis we report involved at least one locus previously associated with antibiotic resistance, including interactions between gyrA and parC associated with ciprofloxacin. Still, many combinations involved unreported loci and genes. Our work provides a systematic identification of epistatic pairs affecting antibiotic resistance in N. gonorrhoeae and a generalizable method for epistatic genome-wide association studies.
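
The remark that the number of possible interactions erodes statistical power can be made concrete with a little arithmetic. The sketch below is not the authors' method; it only counts unordered locus pairs and applies a plain Bonferroni correction, which is one common (and conservative) way to adjust for multiple testing. The locus count and the significance level are hypothetical values chosen for illustration.

```python
from math import comb


def pairwise_tests(n_loci: int) -> int:
    """Number of unordered locus pairs among n_loci loci: n * (n - 1) / 2."""
    return comb(n_loci, 2)


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


if __name__ == "__main__":
    n_loci = 10_000  # hypothetical number of variable loci, not from the paper
    alpha = 0.05     # hypothetical family-wise error rate

    n_pairs = pairwise_tests(n_loci)
    print(f"single-locus tests: {n_loci}, threshold {bonferroni_threshold(alpha, n_loci):.2e}")
    print(f"pairwise tests:     {n_pairs}, threshold {bonferroni_threshold(alpha, n_pairs):.2e}")
```

Because the pair count grows quadratically with the number of loci, the per-test threshold shrinks accordingly, which is why restricting the search to a smaller set of candidate pairs can help.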


Current analysis platforms and methods for detecting copy number variation

Physiological Genomics ◽

10.1152/physiolgenomics.00082.2012 ◽

2013 ◽

Vol 45 (1) ◽

pp. 1-16 ◽

Cited By ~ 45

Author(s):

Wenli Li ◽

Michael Olivier

Keyword(s):

Copy Number Variation ◽

Copy Number ◽

Association Studies ◽

Genome Wide Association Studies ◽

Genomic Structural Variation ◽

Disease Phenotypes ◽

Human Genomes ◽

Genome Wide ◽

Number Variation ◽

Genomic Regions

Copy number variation (CNV), generated through duplication or deletion events that affect one or more loci, is widespread in human genomes and is often associated with functional consequences that may include changes in gene expression levels or fusion of genes. Genome-wide association studies indicate that some disease phenotypes and physiological pathways might be impacted by CNV in a small number of characterized genomic regions. However, the pervasiveness and full impact of such variation remain unclear. Suitable analytic methods are needed to thoroughly mine human genomes for genomic structural variation, and to explore the interplay between observed CNV and disease phenotypes, but many medical researchers are unfamiliar with the features and nuances of recently developed technologies for detecting CNV. In this article, we evaluate a suite of commonly used and recently developed approaches to uncovering genome-wide CNVs and discuss the relative merits of each.


The MIR137 VNTR rs58335419 Is Associated With Cognitive Impairment in Schizophrenia and Altered Cortical Morphology

Schizophrenia Bulletin ◽

10.1093/schbul/sbaa123 ◽

2020 ◽

Author(s):

Ebrahim Mahmoudi ◽

Joshua R Atkins ◽

Yann Quidé ◽

William R Reay ◽

Heath M Cairns ◽

...

Keyword(s):

Association Studies ◽

Variable Number Tandem Repeat ◽

Variable Number ◽

Brain Morphology ◽

Whole Genome Sequencing Data ◽

Genome Wide Association Studies ◽

Nucleotide Polymorphisms ◽

Sequencing Data ◽

Genome Wide ◽

Cortical Morphology

Abstract. Genome-wide association studies (GWAS) of schizophrenia have strongly implicated a risk locus in close proximity to the gene for miR-137. While there are candidate single-nucleotide polymorphisms (SNPs) with functional implications for the microRNA’s expression encompassed by the common haplotype tagged by rs1625579, there are likely to be others, such as the variable number tandem repeat (VNTR) variant rs58335419, that have no proxy on the SNP genotyping platforms used in GWAS to date. Using whole-genome sequencing data from schizophrenia patients (n = 299) and healthy controls (n = 131), we observed that the MIR137 4-repeats VNTR (VNTR4) variant was enriched in a cognitive deficit subtype of schizophrenia and associated with altered brain morphology, including thicker left inferior temporal gyrus and deeper right postcentral sulcus. These findings suggest that the MIR137 VNTR4 may impact neuroanatomical development that may, in turn, influence the expression of more severe cognitive symptoms in patients with schizophrenia.


Elevated plasma levels of CXCL16 in severe COVID-19 patients

10.1101/2021.09.07.21263222 ◽

2021 ◽

Author(s):

Sandra P. Smieszek ◽

Vasilios M. Polymeropoulos ◽

Christos M. Polymeropoulos ◽

Bartlomiej P. Przychodzen ◽

Gunther Birznieks ◽

...

Keyword(s):

Plasma Levels ◽

Association Studies ◽

Plasma Concentrations ◽

Hospitalized Patients ◽

Whole Genome Sequencing Data ◽

Genome Wide Association Studies ◽

Sequencing Data ◽

Genome Wide ◽

Targeted Treatments ◽

Severe Manifestation

Abstract. Genome-wide association studies have recently identified 3p21.31, with the lead variant pointing to the CXCR6 gene, as the strongest susceptibility locus reported thus far for severe manifestation of COVID-19. To determine its role, we measured levels of Chemokine (C-X-C motif) ligand 16 (CXCL16) in the plasma of hospitalized COVID-19 patients. CXCL16 interacts with CXCR6, promoting chemotaxis or cell adhesion. The CXCR6/CXCL16 axis mediates homing of T cells to the lungs in disease, and hyper-expression is associated with localised cellular injury. To characterize the CXCR6/CXCL16 axis in the pathogenesis of severe COVID-19, plasma concentrations of CXCL16 collected at baseline from 115 hospitalized COVID-19 patients participating in the ODYSSEY COVID-19 clinical trial were assessed together with a set of controls. We report elevated levels of CXCL16 in a cohort of hospitalized COVID-19 patients. Specifically, we report significant elevation of CXCL16 plasma levels in association with severity of COVID-19 (as defined by the WHO scale) (P-value < 0.02). Our current study is the largest study thus far reporting CXCL16 levels in hospitalized COVID-19 patients (with whole-genome sequencing data available). The results further support the significant role of the CXCR6/CXCL16 axis in the immunopathogenesis of severe COVID-19 and warrant further studies to understand which patients would benefit most from targeted treatments.


Identification of deleterious and regulatory genomic variations in known asthma loci

10.1101/389031 ◽

2018 ◽

Author(s):

Matthew D. C. Neville ◽

Jihoon Choi ◽

Jonathan Lieberman ◽

Qing Ling Duan

Keyword(s):

Candidate Gene ◽

Association Studies ◽

Genome Wide Association ◽

Whole Genome Sequencing Data ◽

Sequence Variants ◽

Genome Wide Association Studies ◽

Sequencing Data ◽

Regulatory Variants ◽

Genome Wide ◽

Regulatory Effects

Abstract. Background: Candidate gene and genome-wide association studies have identified hundreds of asthma risk loci. The majority of associated variants, however, are not known to have any biological function and are believed to represent markers rather than true causative mutations. We hypothesized that many of these associated markers are in linkage disequilibrium (LD) with the elusive causative variants. Methods: We compiled a comprehensive list of 447 asthma-associated variants previously reported in candidate gene and genome-wide association studies. Next, we identified all sequence variants located within the 304 unique genes using whole-genome sequencing data from the 1000 Genomes Project. Then, we calculated the LD between known asthma variants and the sequence variants within each gene. LD variants identified were then annotated to determine those that are potentially deleterious and/or functional (i.e. coding or regulatory effects on the encoded transcript or protein). Results: We identified 10,048 variants in LD (r² > 0.6) with known asthma variants. Annotations of these LD variants revealed that several have potentially deleterious effects including frameshift, alternate splice site, stop-lost, and missense. Moreover, 24 of the LD variants have been reported to regulate gene expression as expression quantitative trait loci (eQTLs). Conclusions: This study is proof of concept that many of the genetic loci previously associated with complex diseases such as asthma are not causative but represent markers of disease, which are in LD with the elusive causative variants. We hereby report a number of potentially deleterious and regulatory variants that are in LD with the reported asthma loci. These reported LD variants could account for the original association signals with asthma and represent the true causative mutations at these loci.
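
For readers unfamiliar with the r² statistic behind the LD threshold quoted above, the following is a minimal sketch of how it can be computed for two biallelic variants. It assumes phased haplotypes encoded as 0/1 lists (1 = alternate allele); the function name `ld_r2` and the toy data are illustrative only, and the abstract does not say which software the authors used for this step.

```python
from typing import Sequence


def ld_r2(hap_a: Sequence[int], hap_b: Sequence[int]) -> float:
    """Squared correlation (r^2) between two biallelic variants.

    hap_a and hap_b hold one 0/1 allele per phased haplotype, in the same order.
    """
    if len(hap_a) != len(hap_b) or len(hap_a) == 0:
        raise ValueError("haplotype vectors must be non-empty and of equal length")

    n = len(hap_a)
    p_a = sum(hap_a) / n  # alternate-allele frequency at variant A
    p_b = sum(hap_b) / n  # alternate-allele frequency at variant B
    p_ab = sum(1 for a, b in zip(hap_a, hap_b) if a == 1 and b == 1) / n

    d = p_ab - p_a * p_b  # linkage disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        # r^2 is undefined when either variant is monomorphic;
        # returning 0.0 is a convention chosen for this sketch.
        return 0.0
    return d * d / denom


if __name__ == "__main__":
    known_variant = [0, 1, 0, 1, 1, 0, 1, 0]  # toy data, not from the study
    candidate = [0, 1, 0, 1, 1, 0, 0, 0]
    r2 = ld_r2(known_variant, candidate)
    print(f"r2 = {r2:.3f}, passes r2 > 0.6: {r2 > 0.6}")
```

With unphased genotypes, haplotype frequencies would first have to be estimated, which this sketch does not attempt.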


HAPPI GWAS: Holistic Analysis with Pre and Post Integration GWAS

10.1101/2020.04.07.998690 ◽

2020 ◽

Cited By ~ 2

Author(s):

Marianne L. Slaten ◽

Yen On Chan ◽

Vivek Shrestha ◽

Alexander E. Lipka ◽

Ruthie Angelovici

Keyword(s):

Association Studies ◽

Phenotypic Traits ◽

Genome Wide Association Studies ◽

Sequencing Data ◽

Gwas Analysis ◽

Genome Wide ◽

Large Populations ◽

Unbiased Estimates ◽

Best Linear Unbiased ◽

Automated Pipeline

Abstract. Motivation: Advanced publicly available sequencing data from large populations have enabled informative genome-wide association studies (GWAS) that associate SNPs with phenotypic traits of interest. Many publicly available tools able to perform GWAS have been developed in response to increased demand. However, these tools lack a comprehensive pipeline that includes pre-GWAS analysis such as outlier removal, data transformation, and calculation of Best Linear Unbiased Predictions (BLUPs) or Best Linear Unbiased Estimates (BLUEs). In addition, post-GWAS analysis such as haploblock analysis and candidate gene identification is lacking. Results: Here, we present HAPPI GWAS, an open-source GWAS tool able to perform pre-GWAS, GWAS, and post-GWAS analysis in an automated pipeline using the command-line interface. Availability: HAPPI GWAS is written in R for any Unix-like operating systems and is available on GitHub (https://github.com/Angelovici-Lab/HAPPI.GWAS.git).


Genes identified through genome-wide association studies of osteonecrosis in childhood acute lymphoblastic leukemia patients

Pharmacogenomics ◽

10.2217/pgs-2019-0087 ◽

2019 ◽

Vol 20 (17) ◽

pp. 1189-1197 ◽

Cited By ~ 1

Author(s):

Vincent Gagné ◽

Anne Aubry-Morin ◽

Maria Plesa ◽

Rachid Abaji ◽

Kateryna Petrykey ◽

...

Keyword(s):

Association Studies ◽

Lymphoblastic Leukemia ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Sequencing Data ◽

Childhood All ◽

Exome Sequencing Data ◽

Genome Wide ◽

Whole Exome ◽

Whole Exome Sequencing Data

Aim: To evaluate top-ranking genes identified through genome-wide association studies for an association with corticosteroid-related osteonecrosis in children with acute lymphoblastic leukemia (ALL) who received Dana–Farber Cancer Institute treatment protocols. Patients & methods: Lead SNPs from these studies, as well as other variants in the same genes, pooled from whole exome sequencing data, were analyzed for an association with osteonecrosis in childhood ALL patients from the Quebec cohort. Top-ranking variants were verified in the replication patient group. Results: The analyses of variants in the ACP1-SH3YL1 locus derived from whole exome sequencing data showed an association of several correlated SNPs (rs11553746, rs2290911, rs7595075, rs2306060 and rs79716074). The rs79716074 variant defines the *B haplotype of the ACP1 gene, which is well known for its functional role. Conclusion: This study confirms the implication of the ACP1 gene in treatment-related osteonecrosis in childhood ALL and identifies a novel, potentially causal variant of this complication.


HAPPI GWAS: Holistic Analysis with Pre- and Post-Integration GWAS

Bioinformatics ◽

10.1093/bioinformatics/btaa589 ◽

2020 ◽

Vol 36 (17) ◽

pp. 4655-4657

Author(s):

Marianne L Slaten ◽

Yen On Chan ◽

Vivek Shrestha ◽

Alexander E Lipka ◽

Ruthie Angelovici

Keyword(s):

Association Studies ◽

Supplementary Information ◽

Phenotypic Traits ◽

Genome Wide Association Studies ◽

Sequencing Data ◽

Gwas Analysis ◽

Genome Wide ◽

Large Populations ◽

Best Linear Unbiased ◽

Holistic Analysis

Abstract. Motivation: Advanced publicly available sequencing data from large populations have enabled informative genome-wide association studies (GWAS) that associate SNPs with phenotypic traits of interest. Many publicly available tools able to perform GWAS have been developed in response to increased demand. However, these tools lack a comprehensive pipeline that includes pre-GWAS analysis, such as outlier removal, data transformation and calculation of Best Linear Unbiased Predictions or Best Linear Unbiased Estimates. In addition, post-GWAS analysis, such as haploblock analysis and candidate gene identification, is lacking. Results: Here, we present Holistic Analysis with Pre- and Post-Integration (HAPPI) GWAS, an open-source GWAS tool able to perform pre-GWAS, GWAS and post-GWAS analysis in an automated pipeline using the command-line interface. Availability and implementation: HAPPI GWAS is written in R for any Unix-like operating systems and is available on GitHub (https://github.com/Angelovici-Lab/HAPPI.GWAS.git). Supplementary information: Supplementary data are available at Bioinformatics online.


The use of genomic data and imputation methods in dairy cattle breeding

Czech Journal of Animal Science ◽

10.17221/83/2020-cjas ◽

2020 ◽

Vol 65 (No. 12) ◽

pp. 445-453

Author(s):

Anita Klímová ◽

Eva Kašná ◽

Karolína Machová ◽

Michaela Brzáková ◽

Josef Přibyl ◽

...

Keyword(s):

Association Studies ◽

Genome Wide Association Studies ◽

Nucleotide Polymorphisms ◽

Sequencing Data ◽

Phenotypic Data ◽

Single Nucleotide ◽

Imputation Methods ◽

Genome Wide ◽

Dairy Cattle Breeding ◽

Combine Information

The inclusion of animal genotype data has contributed to the development of genomic selection. Animals are selected not only based on pedigree and phenotypic data but also on the basis of information about their genotypes. Genomic information helps to increase the accuracy of selection of young animals and thus enables a reduction of the generation interval. Obtaining information about genotypes in the form of SNPs (single nucleotide polymorphisms) has led to the development of new chips for genotyping. Several methods of genomic comparison have been developed as a result. One of these methods is data imputation, which allows SNPs missing from low-density chips to be inferred up to the density of high-density chips. Through imputation, it is possible to combine information from diverse sets of chips and thus obtain more information about genotypes at a lower cost. Increasing the amount of data helps increase the reliability of predicting genomic breeding values. Imputation methods are increasingly used in genome-wide association studies. Combining classical genotyping with genome-wide sequencing data helps to increase the chances of identifying loci that are associated with economically significant traits.
