Genome-wide profiling of microRNAs and prediction of mRNA targets in 17 bovine tissues


10.1101/574954 ◽

2019 ◽

Author(s):

Min Wang ◽

Amanda J Chamberlain ◽

Claire P Prowse-Wilkins ◽

Christy J Vander Jagt ◽

Timothy P Hancock ◽

...

Keyword(s):

Complex Traits ◽

Whole Genome Sequence ◽

Messenger RNAs ◽

mRNA Targets ◽

Genome Wide ◽

Mature MicroRNA ◽

Target Sites ◽

Causal Variants ◽

Mature MicroRNAs ◽

Animal Genomes

Abstract: MicroRNAs regulate many eukaryotic biological processes in a temporal- and spatial-specific manner. Yet in cattle it is not fully known which microRNAs are expressed in each tissue, which genes they regulate, or which sites a given microRNA binds to within messenger RNAs. An improved annotation of tissue-specific microRNA networks may in the future assist with the identification of causal variants affecting complex traits. Here, we report findings from analysing short RNA sequence data from 17 tissues from a single lactating dairy cow. Using miRDeep2, we identified 699 expressed mature microRNA sequences. Using TargetScan, known (60%) and novel (40%) microRNAs were predicted to interact with 780,481 sites in bovine messenger RNAs homologous with human. Putative interactions between microRNA families and targets were significantly enriched for interactions from previous experimental and computational identification. Characterizing features of microRNAs and targets, we showed that (1) mature microRNAs derived from different arms of the same precursor targeted different genes in different tissues; (2) microRNA target sites preferentially occurred within gene regions marked with active histone modification; (3) variants within microRNAs and targets had lower allele frequencies than variants across the genome, as identified from 65 million whole genome sequence variants; (4) no significant correlation was found between the abundance of microRNAs and messenger RNAs differentially expressed in the same tissue; (5) microRNAs and target sites were not significantly associated with allelic imbalance of gene targets. This study contributes to the goals of the Functional Annotation of Animal Genomes consortium to improve the annotation of genomes of domestic animals.
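To make the idea of a predicted microRNA target site concrete, here is a minimal Python sketch of canonical seed matching: it looks for the reverse complement of microRNA nucleotides 2 to 8 in a transcript sequence. This is only an illustration of the general principle behind seed-based prediction; it is not TargetScan's scoring pipeline, and the function names and the toy transcript are hypothetical.

```python
# Minimal sketch of seed-based target-site search (an illustration only,
# not TargetScan itself). Function names and the toy UTR are hypothetical.

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement_rna(seq):
    """Return the reverse complement of an RNA sequence (A, C, G, U only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def find_seed_sites(mirna, utr):
    """Return 0-based start positions in `utr` that pair with miRNA nt 2-8."""
    mirna = mirna.upper().replace("T", "U")
    utr = utr.upper().replace("T", "U")
    seed = mirna[1:8]                      # nucleotides 2-8 of the mature miRNA
    site = reverse_complement_rna(seed)    # sequence expected in the transcript
    positions = []
    start = utr.find(site)
    while start != -1:
        positions.append(start)
        start = utr.find(site, start + 1)  # allow overlapping matches
    return positions


if __name__ == "__main__":
    mature_mirna = "UGAGGUAGUAGGUUGUAUAGUU"   # let-7a-like sequence, illustrative
    toy_utr = "AAACUACCUCAGGGCUACCUCUU"       # made-up transcript fragment
    print(find_seed_sites(mature_mirna, toy_utr))
```

A real predictor would additionally weigh site type, sequence context, and conservation; the sketch only reports where a perfect seed match occurs.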


Genome-wide profiling of microRNAs and prediction of mRNA targets in 17 bovine tissues

10.21203/rs.2.9876/v1 ◽

2019 ◽

Author(s):

Min Wang ◽

Amanda J Chamberlain ◽

Claire P Prowse-Wilkins ◽

Christy J Vander Jagt ◽

Timothy P Hancock ◽

...

Keyword(s):

Complex Traits ◽

Whole Genome Sequence ◽

Messenger RNAs ◽

mRNA Targets ◽

Genome Wide ◽

Target Sites ◽

Causal Variants ◽

Mature MicroRNAs ◽

Temporal And Spatial ◽

Animal Genomes

Abstract. Background: MicroRNAs regulate many eukaryotic biological processes in a temporal- and spatial-specific manner. Yet in cattle it is not fully known which microRNAs are expressed in each tissue, which genes they regulate, or which sites a given microRNA binds to within messenger RNAs (mRNAs). An improved annotation of tissue-specific microRNA networks may in the future assist with the identification of causal variants affecting complex traits. Results: We report findings from analysing short RNA sequence data from 17 tissues from a single lactating dairy cow. Using miRDeep2, we identified 699 expressed mature microRNAs. Using TargetScan, known (60%) and novel (40%) microRNAs were predicted to interact with 780,481 sites in bovine mRNAs homologous with human. Putative interactions between microRNA families and targets were significantly enriched for interactions from previous experimental and computational identification. Characterizing features of microRNAs and targets, we showed that (1) mature microRNAs derived from different arms of the same precursor targeted different genes in different tissues; (2) microRNA target sites preferentially occurred within gene regions undergoing active histone modification; (3) variants within microRNAs and targets had lower allele frequencies than variants across the genome, as identified from 65 million whole genome sequence variants; (4) no significant correlation was found between the abundance of microRNAs and mRNAs differentially expressed in the same tissue; (5) microRNAs and target sites were not significantly associated with allelic imbalance of gene targets. Conclusion: This study contributes to the goals of the Functional Annotation of Animal Genomes consortium to improve the annotation of genomes of domestic animals.


Comparison of methods that use whole genome data to estimate the heritability and genetic architecture of complex traits

10.1101/115527 ◽

2017 ◽

Cited By ~ 10

Author(s):

Luke M. Evans ◽

Rasool Tahmasbi ◽

Scott I. Vrieze ◽

Gonçalo R. Abecasis ◽

Sayantan Das ◽

...

Keyword(s):

Complex Traits ◽

SNP Array ◽

Causal Variant ◽

Whole Genome Sequence ◽

Whole Genome ◽

Narrow Sense Heritability ◽

Frequency Spectra ◽

Genome Wide ◽

Variant Frequency ◽

Causal Variants

Abstract: Heritability, h², is a foundational concept in genetics, critical to understanding the genetic basis of complex traits. Recently developed methods that estimate heritability from genotyped SNPs, h²_SNP, explain substantially more genetic variance than genome-wide significant loci, but less than classical estimates from twins and families. However, h²_SNP estimates have yet to be comprehensively compared under a range of genetic architectures, making it difficult to draw conclusions from sometimes conflicting published estimates. Here, we used thousands of real whole genome sequences to simulate realistic phenotypes under a variety of genetic architectures, including those from very rare causal variants. We compared the performance of ten methods across different types of genotypic data (commercial SNP array positions, whole genome sequence variants, and imputed variants) and under differing causal variant frequencies, levels of stratification, and relatedness thresholds. These results provide guidance in interpreting past results and choosing optimal approaches for future studies. We then chose the two methods (GREML-MS and GREML-LDMS) that best estimated overall h²_SNP and the causal variant frequency spectra, and applied them to six phenotypes in the UK Biobank using imputed genome-wide variants. Our results suggest that as imputation reference panels become larger and more diverse, estimates of the frequency distribution of causal variants will become increasingly unbiased and the vast majority of trait narrow-sense heritability will be accounted for.


Predicting causal variants affecting expression using whole genome sequence and RNA-seq from multiple human tissues

10.1101/088872 ◽

2016 ◽

Cited By ~ 2

Author(s):

Andrew Anand Brown ◽

Ana Viñuela ◽

Olivier Delaneau ◽

Tim Spector ◽

Kerrin Small ◽

...

Keyword(s):

Genome Sequence ◽

Complex Traits ◽

Causal Variant ◽

Whole Genome Sequence ◽

Open Chromatin ◽

Whole Genome ◽

RNA-Seq ◽

Derived Properties ◽

Causal Variants ◽

Genomic Regions

Genetic association mapping produces statistical links between phenotypes and genomic regions, but identifying the causal variants themselves remains difficult. Complete knowledge of all genetic variants, as provided by whole genome sequence (WGS), will help, but is currently financially prohibitive for well-powered GWAS. To explore the advantages of WGS in a well-powered setting, we performed eQTL mapping using WGS and RNA-seq, and showed that the lead eQTL variants called using WGS are more likely to be causal. We derived properties of the causal variant from simulation studies, and used these to propose a method for implicating likely causal SNPs. This method predicts that 25% to 70% of the causal variants lie in open chromatin regions, depending on tissue and experiment. Finally, we identify a set of high-confidence causal variants and show that they are more enriched in GWAS associations than other eQTLs. Of these, we find 65 associations with GWAS traits and show examples where the gene implicated by expression has been functionally validated as relevant for complex traits.


SparsePro: an efficient genome-wide fine-mapping method integrating summary statistics and functional annotations

10.1101/2021.10.04.463133 ◽

2021 ◽

Author(s):

Wenmin Zhang ◽

Hamed S Najafabadi ◽

Yue Li

Keyword(s):

Fine Mapping ◽

Complex Traits ◽

Genetic Architecture ◽

Association Studies ◽

Computational Cost ◽

Mapping Method ◽

Genome Wide Association Studies ◽

Functional Annotations ◽

Genome Wide ◽

Causal Variants

Identifying causal variants from genome-wide association studies (GWASs) is challenging due to widespread linkage disequilibrium (LD). Functional annotations of the genome may help prioritize variants that are biologically relevant and thus improve fine-mapping of GWAS results. However, classical fine-mapping methods have a high computational cost, particularly when the underlying genetic architecture and LD patterns are complex. Here, we propose a novel approach, SparsePro, to efficiently conduct functionally informed statistical fine-mapping. Our method offers two major innovations. First, by creating a sparse low-dimensional projection of the high-dimensional genotype, we enable a linear search of causal variants instead of the exponential search of causal configurations used in existing methods. Second, we adopt a probabilistic framework with a highly efficient variational expectation-maximization algorithm to integrate statistical associations and functional priors. We evaluate SparsePro through extensive simulations using resources from the UK Biobank. Compared to state-of-the-art methods, SparsePro achieved more accurate and well-calibrated posterior inference with greatly reduced computation time. We demonstrate the utility of SparsePro by investigating the genetic architecture of five functional biomarkers of vital organs. We identify potential causal variants contributing to the genetically encoded coordination mechanisms between vital organs and pinpoint target genes with potential pleiotropic effects. In summary, we have developed an efficient genome-wide fine-mapping method with the ability to integrate functional annotations. Our method may have wide utility in understanding the genetics of complex traits as well as in increasing the yield of functional follow-up studies of GWASs.


Widespread allelic heterogeneity in complex traits

10.1101/076984 ◽

2016 ◽

Cited By ~ 3

Author(s):

Farhad Hormozdiari ◽

Anthony Zhu ◽

Gleb Kichaev ◽

Ayellet V. Segrè ◽

Chelsea J.-T. Ju ◽

...

Keyword(s):

Complex Traits ◽

Statistical Power ◽

Association Studies ◽

Density Lipoprotein ◽

Computational Method ◽

Allelic Heterogeneity ◽

Genome Wide Association Studies ◽

New Methods ◽

Genome Wide ◽

Causal Variants

Abstract: Recent successes in genome-wide association studies (GWASs) make it possible to address important questions about the genetic architecture of complex traits, such as allele frequency and effect size. One lesser-known aspect of complex traits is the extent of allelic heterogeneity (AH) arising from multiple causal variants at a locus. We developed a computational method to infer the probability of AH and applied it to three GWAS and four expression quantitative trait loci (eQTL) datasets. We identified a total of 4152 loci with strong evidence of AH. The proportion of all loci with identified AH is 4-23% in eQTLs, 35% in GWAS of High-Density Lipoprotein (HDL), and 23% in schizophrenia. For eQTLs, we observed a strong correlation between sample size and the proportion of loci with AH (R² = 0.85, P = 2.2e-16), indicating that limited statistical power prevents identification of AH in other loci. Understanding the extent of AH may guide the development of new methods for fine mapping and association mapping of complex traits.


Identifying causal variants by fine mapping across multiple studies

PLoS Genetics ◽

10.1371/journal.pgen.1009733 ◽

2021 ◽

Vol 17 (9) ◽

pp. e1009733

Author(s):

Nathan LaPierre ◽

Kodi Taraszka ◽

Helen Huang ◽

Rosemary He ◽

Farhad Hormozdiari ◽

...

Keyword(s):

Fine Mapping ◽

Complex Traits ◽

Association Studies ◽

Density Lipoprotein ◽

Genome Wide Association Studies ◽

Multivariate Normal ◽

Multiple Study ◽

Genome Wide ◽

Causal Variants ◽

Different Populations

Increasingly large Genome-Wide Association Studies (GWAS) have yielded numerous variants associated with many complex traits, motivating the development of “fine mapping” methods to identify which of the associated variants are causal. Additionally, GWAS of the same trait for different populations are increasingly available, raising the possibility of refining fine mapping results further by leveraging different linkage disequilibrium (LD) structures across studies. Here, we introduce multiple study causal variants identification in associated regions (MsCAVIAR), a method that extends the popular CAVIAR fine mapping framework to a multiple study setting using a random effects model. MsCAVIAR only requires summary statistics and LD as input, accounts for uncertainty in association statistics using a multivariate normal model, allows for multiple causal variants at a locus, and explicitly models the possibility of different SNP effect sizes in different populations. We demonstrate the efficacy of MsCAVIAR in both a simulation study and a trans-ethnic, trans-biobank fine mapping analysis of High Density Lipoprotein (HDL).
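As a rough illustration of what fine mapping from summary statistics computes, the Python sketch below turns per-variant z-scores into posterior inclusion probabilities and a credible set. It deliberately uses a much simpler model than MsCAVIAR: a single causal variant per locus, a single study, equal prior weight on every variant, and a Wakefield-style approximate Bayes factor. The `shrinkage` parameter (the assumed ratio of prior effect variance to total variance) and all function names are hypothetical choices for this sketch.

```python
# Simplified single-study, single-causal-variant fine mapping from z-scores.
# This is NOT MsCAVIAR (no LD matrix, no multiple causal variants, no random
# effects across studies); it only illustrates the idea of a credible set.
import math


def single_causal_pips(z_scores, shrinkage=0.9):
    """Posterior inclusion probabilities assuming exactly one causal variant.

    `shrinkage` is an assumed value of W / (V + W) in an approximate Bayes
    factor, taken here to be the same for every variant.
    """
    log_bf = [0.5 * math.log(1.0 - shrinkage) + 0.5 * shrinkage * z * z
              for z in z_scores]
    peak = max(log_bf)                       # subtract max for numerical stability
    weights = [math.exp(v - peak) for v in log_bf]
    total = sum(weights)
    return [w / total for w in weights]


def credible_set(pips, coverage=0.95):
    """Smallest set of variant indices whose summed PIP reaches `coverage`."""
    order = sorted(range(len(pips)), key=lambda i: pips[i], reverse=True)
    chosen, cumulative = [], 0.0
    for index in order:
        chosen.append(index)
        cumulative += pips[index]
        if cumulative >= coverage:
            break
    return chosen


if __name__ == "__main__":
    toy_z = [1.0, 5.0, 0.5, 4.0]             # made-up association z-scores
    pips = single_causal_pips(toy_z)
    print([round(p, 4) for p in pips], credible_set(pips))
```

Methods such as the one described above go further by modelling LD between variants, allowing several causal variants, and combining studies; a smaller credible set at the same coverage is the usual sign of better resolution.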


Investigating the Effect of Imputed Structural Variants from Whole-Genome Sequence on Genome-Wide Association and Genomic Prediction in Dairy Cattle

Animals ◽

10.3390/ani11020541 ◽

2021 ◽

Vol 11 (2) ◽

pp. 541

Author(s):

Long Chen ◽

Jennie E. Pryce ◽

Ben J. Hayes ◽

Hans D. Daetwyler

Keyword(s):

Dairy Cattle ◽

Genomic Prediction ◽

Complex Traits ◽

Prediction Accuracy ◽

Association Studies ◽

Genome Wide Association ◽

Whole Genome Sequence ◽

Genome Wide Association Studies ◽

Whole Genome ◽

Genome Wide

Structural variations (SVs) are large DNA segments of deletions, duplications, copy number variations, inversions and translocations in a re-sequenced genome compared to a reference genome. They have been found to be associated with several complex traits in dairy cattle and could potentially help to improve genomic prediction accuracy of dairy traits. Imputation of SVs was performed in individuals genotyped with single-nucleotide polymorphism (SNP) panels without the expense of sequencing them. In this study, we generated 24,908 high-quality SVs in a total of 478 whole-genome sequenced Holstein and Jersey cattle. We imputed 4489 SVs with R² > 0.5 into 35,568 Holstein and Jersey dairy cattle with 578,999 SNPs using two pipelines, FImpute and Eagle2.3-Minimac3. Genome-wide association studies for production, fertility and overall type with these 4489 SVs revealed four significant SVs, of which two were highly linked to significant SNPs. We also estimated the variance components for SNP and SV models for these traits using genomic best linear unbiased prediction (GBLUP). Furthermore, we assessed the effect on genomic prediction accuracy of adding SVs to GBLUP models. The estimated percentage of genetic variance captured by SVs for production traits was up to 4.57% for milk yield in bulls and 3.53% for protein yield in cows. Finally, no consistent increase in genomic prediction accuracy was observed when including SVs in GBLUP.


Integrating gene expression with summary association statistics to identify susceptibility genes for 30 complex traits

10.1101/072967 ◽

2016 ◽

Cited By ~ 2

Author(s):

Nicholas Mancuso ◽

Huwenbo Shi ◽

Pagé Goddard ◽

Gleb Kichaev ◽

Alexander Gusev ◽

...

Keyword(s):

Gene Expression ◽

Genetic Correlation ◽

Complex Traits ◽

Association Studies ◽

Susceptibility Genes ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Genome Wide ◽

Causal Variants

Abstract: Although genome-wide association studies (GWASs) have identified thousands of risk loci for many complex traits and diseases, the causal variants and genes at these loci remain largely unknown. We leverage recently introduced methods to integrate gene expression measurements from 45 expression panels with summary GWAS data to perform 30 transcriptome-wide association studies (TWASs). First, we identify 1,196 susceptibility genes whose expression is associated with these traits; of these, 168 reside more than 0.5 Mb away from any previously reported GWAS significant variant, thus providing new risk loci. Second, we find 43 pairs of traits with significant genetic correlation at the level of predicted expression; of these, 8 are not found through genetic correlation at the SNP level. Third, we use bi-directional regression to find evidence for BMI causally influencing triglyceride levels, and triglyceride levels causally influencing LDL. Taken together, our results provide insights into the role of expression in susceptibility to complex traits and diseases.


Identifying Causal Variants by Fine Mapping Across Multiple Studies

10.1101/2020.01.15.908517 ◽

2020 ◽

Cited By ~ 2

Author(s):

Nathan LaPierre ◽

Kodi Taraszka ◽

Helen Huang ◽

Rosemary He ◽

Farhad Hormozdiari ◽

...

Keyword(s):

Fine Mapping ◽

Complex Traits ◽

Association Studies ◽

Genome Wide Association Studies ◽

Multiple Study ◽

Current State ◽

Genome Wide ◽

Causal Variants ◽

Different Populations

Abstract: Increasingly large Genome-Wide Association Studies (GWAS) have yielded numerous variants associated with many complex traits, motivating the development of “fine mapping” methods to identify which of the associated variants are causal. Additionally, GWAS of the same trait for different populations are increasingly available, raising the possibility of refining fine mapping results further by leveraging different linkage disequilibrium (LD) structures across studies. Here, we introduce multiple study causal variants identification in associated regions (MsCAVIAR), a method that extends the popular CAVIAR fine mapping framework to a multiple study setting using a random effects model. MsCAVIAR only requires summary statistics and LD as input, accounts for uncertainty in association statistics using a multivariate normal model, allows for multiple causal variants at a locus, and explicitly models the possibility of different SNP effect sizes in different populations. In a trans-ethnic, trans-biobank Type 2 Diabetes analysis, we show that MsCAVIAR returns causal set sizes that are over 20% smaller than those given by current state of the art methods for trans-ethnic fine-mapping.


A unifying framework for joint trait analysis under a non-infinitesimal model

10.1101/293803 ◽

2018 ◽

Author(s):

Ruth Johnson ◽

Huwenbo Shi ◽

Bogdan Pasaniuc ◽

Sriram Sankararaman

Keyword(s):

Complex Traits ◽

Association Studies ◽

Supplementary Information ◽

Genome Wide Association Studies ◽

Posterior Density ◽

Trait Analysis ◽

Genome Wide ◽

Genetic Overlap ◽

Causal Variants ◽

Supplementary Material

Abstract. Motivation: A large proportion of risk regions identified by genome-wide association studies (GWAS) are shared across multiple diseases and traits. Understanding whether this clustering is due to sharing of causal variants or chance colocalization can provide insights into shared etiology of complex traits and diseases. Results: In this work, we propose a flexible, unifying framework called UNITY (Unifying Non-Infinitesimal Trait analYsis) to quantify the overlap between a pair of traits. We formulate a Bayesian generative model that relates the overlap between pairs of traits to GWAS summary statistic data under a non-infinitesimal genetic architecture underlying each trait. We propose a Metropolis-Hastings sampler to compute the posterior density of the genetic overlap parameters in this model. We validate our method through comprehensive simulations and analyze summary statistics from height and BMI GWAS to show that it produces estimates consistent with the known genetic makeup of both traits. Availability: The UNITY software is made freely available to the research community at: https://github.com/bogdanlab/[email protected] Supplementary information: Supplementary data are available at Bioinformatics online.
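For readers unfamiliar with the sampling step mentioned above, the following Python sketch shows a generic random-walk Metropolis-Hastings sampler. It does not implement the UNITY model: the target here is a stand-in, the posterior of a single proportion under a uniform prior and binomial data, chosen because its exact posterior mean is known. The function names, step size, and toy counts are assumptions made for this illustration.

```python
# Generic random-walk Metropolis-Hastings sampler on a toy target.
# This is a stand-in for illustration, not the UNITY generative model.
import math
import random


def log_posterior(p, successes, trials):
    """Unnormalised log posterior of a proportion p under a uniform prior."""
    if not 0.0 < p < 1.0:
        return -math.inf                     # zero density outside (0, 1)
    return successes * math.log(p) + (trials - successes) * math.log(1.0 - p)


def metropolis_hastings(log_target, start, n_samples, step=0.1, seed=0):
    """Draw n_samples from log_target; `start` must have finite log density."""
    rng = random.Random(seed)
    current, current_lp = start, log_target(start)
    samples = []
    for _ in range(n_samples):
        proposal = current + rng.gauss(0.0, step)   # symmetric proposal
        proposal_lp = log_target(proposal)
        # Symmetric proposal, so the acceptance ratio is the target ratio.
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        samples.append(current)
    return samples


if __name__ == "__main__":
    draws = metropolis_hastings(lambda p: log_posterior(p, 30, 100),
                                start=0.5, n_samples=20000)
    kept = draws[2000:]                      # discard burn-in
    print(sum(kept) / len(kept))             # exact posterior mean is 31/102
```

In a model like the one described, the scalar `p` would be replaced by the vector of overlap parameters and `log_posterior` by the model's log joint density given the summary statistics.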
