Biological insights from self-perceived facial aging data of the UKBB participants

2019 ◽  
Author(s):  
Simona Vigodner ◽  
Raya Khanin

Genetic underpinnings of facial aging are still largely unknown. In this study, we leverage the statistical power of large-scale data from the UK Biobank and perform in silico analysis of genome-wide self-perceived facial aging. Functional analysis reveals significant over-representation of skin pigmentation and immune-related pathways that are correlated with facial aging. For males, hair loss is one of the top categories that is highly significantly over-represented in the genetic data associated with self-reported facial aging. Our analysis confirms that genes coding for the extracellular matrix play important roles in aging. Overall, our results provide evidence that, while somewhat biased, large-scale self-reported data on aging can be used to extract useful insights into the underlying biology, provide candidate skin aging biomarkers, and advance anti-aging skincare.
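The abstract does not say which statistical test underlies the over-representation analysis. As a minimal sketch, assuming the commonly used upper-tail hypergeometric test, the p-value for a gene set overlapping a pathway could be computed as follows. The function name `enrichment_p_value` and its parameters are hypothetical and are not taken from the study.

```python
from math import comb


def enrichment_p_value(universe: int, pathway: int, selected: int, overlap: int) -> float:
    """Upper-tail hypergeometric probability P(X >= overlap).

    universe: number of genes considered in total (assumed background set)
    pathway:  number of those genes annotated to the pathway
    selected: number of genes associated with the trait
    overlap:  number of selected genes that fall in the pathway
    """
    if not (0 <= pathway <= universe and 0 <= selected <= universe):
        raise ValueError("pathway and selected must lie between 0 and universe")
    max_overlap = min(pathway, selected)
    total = comb(universe, selected)
    # Sum the probability of every overlap at least as large as the observed one.
    tail = sum(
        comb(pathway, k) * comb(universe - pathway, selected - k)
        for k in range(max(overlap, 0), max_overlap + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Toy numbers for illustration only; they are not results from the study.
    print(enrichment_p_value(universe=10, pathway=5, selected=5, overlap=5))
```

In practice such p-values would also need correction for testing many pathways at once, which this sketch leaves out.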

2021 ◽  
Author(s):  
Runqing Yang ◽  
Yuxin Song ◽  
Li Jiang ◽  
Zhiyu Hao ◽  
Runqing Yang

Complex computation and approximate solutions hinder the application of generalized linear mixed models (GLMM) to genome-wide association studies. We extended GRAMMAR to handle binary diseases by treating genomic breeding values (GBVs) estimated in advance as a known predictor in genomic logit regression, and then controlled polygenic effects by adjusting genomic heritability downward. Using simulations and case analyses, we showed that, in the optimized GRAMMAR, polygenic effects and genomic control could be evaluated using fewer sampled markers, which greatly simplified GLMM-based association analysis of large-scale data. In addition, joint analysis of quantitative trait nucleotide (QTN) candidates chosen by multiple testing offered significantly improved statistical power to detect QTNs over existing methods.


2015 ◽  
Author(s):  
Liya Wang ◽  
Peter Van Buren ◽  
Doreen Ware

Over the past few years, cloud-based platforms have been proposed to address storage, management, and computation of large-scale data, especially in the field of genomics. However, for collaborative efforts involving multiple institutes, data transfer and management, interoperability, and standardization among different platforms have posed new challenges. This paper proposes a distributed bioinformatics platform that can combine local clusters with remote computational clusters for genomic analysis using a unified bioinformatics workflow. The platform is built with a data server configured with iRODS, a computation cluster authenticated with the iPlant Agave system, and a web server for interacting with the platform. A Genome-Wide Association Study workflow is integrated to validate the feasibility of the proposed approach.


2019 ◽  
Author(s):  
Longda Jiang ◽  
Zhili Zheng ◽  
Ting Qi ◽  
Kathryn E. Kemper ◽  
Naomi R. Wray ◽  
...  

The genome-wide association study (GWAS) has been widely used as an experimental design to detect associations between genetic variants and a phenotype. Two major confounding factors, population stratification and relatedness, could potentially lead to inflated GWAS test statistics and thereby spurious associations. Mixed linear model (MLM)-based approaches can be used to account for sample structure. However, genome-wide association (GWA) analyses in biobank samples such as the UK Biobank (UKB) often exceed the capability of most existing MLM-based tools, especially if the number of traits is large. Here, we developed an MLM-based tool (called fastGWA) that controls for population stratification by principal components and relatedness by a sparse genetic relationship matrix for GWA analyses of biobank-scale data. We demonstrated by extensive simulations that fastGWA is reliable, robust and highly resource-efficient. We then applied fastGWA to 2,173 traits on 456,422 array-genotyped and imputed individuals and 2,048 traits on 46,191 whole-exome-sequenced individuals in the UKB.
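Inflation of test statistics, as mentioned in the abstract, is often summarized with the genomic inflation factor: the median observed association statistic divided by the median expected under the null. The sketch below illustrates that diagnostic only; it is not part of fastGWA, and the function name `genomic_inflation` is hypothetical. It assumes the inputs are chi-square statistics with one degree of freedom.

```python
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(chi2_stats):
    """Return the genomic inflation factor (lambda) for 1-df chi-square statistics.

    Values near 1 suggest little inflation; values well above 1 may indicate
    uncorrected stratification or relatedness (or genuine polygenic signal).
    """
    stats = list(chi2_stats)
    if not stats:
        raise ValueError("at least one test statistic is required")
    return median(stats) / CHI2_1DF_MEDIAN


if __name__ == "__main__":
    # Toy statistics for illustration only.
    print(genomic_inflation([0.1, 0.9098728, 5.0]))
```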


2021 ◽  
Author(s):  
Suyash S Shringarpure ◽  
Wei Wang ◽  
Yunxuan Jiang ◽  
Alison Acevedo ◽  
Devika Dhamija ◽  
...  

A key challenge in the study of rare disease genetics is assembling large case cohorts for well-powered studies. We demonstrate the use of self-reported diagnosis data to study rare diseases at scale. We performed genome-wide association studies (GWAS) for 33 rare diseases using self-reported diagnosis phenotypes and re-discovered 29 known associations to validate our approach. In addition, we performed the first GWAS for Duane retraction syndrome, vestibular schwannoma and spontaneous pneumothorax, and report novel genome-wide significant associations for these diseases. We replicated these novel associations in non-European populations within the 23andMe, Inc. cohort as well as in the UK Biobank cohort. We also show that mixed model analyses including all ethnicities and related samples increase the power for finding associations in rare diseases. Our results, based on analysis of 19,084 rare disease cases for 33 diseases from 7 populations, show that large-scale online collection of self-reported data is a viable method for discovery and replication of genetic associations for rare diseases. This approach, which is complementary to sequencing-based approaches, will enable the discovery of more novel genetic associations for increasingly rare diseases across multiple ancestries and shed more light on the genetic architecture of rare diseases.


2021 ◽  
Author(s):  
Adina S. Wagner ◽  
Laura K. Waite ◽  
Małgorzata Wierzba ◽  
Felix Hoffstaedter ◽  
Alexander Q. Waite ◽  
...  

Large-scale datasets present unique opportunities to perform scientific investigations with unprecedented breadth. However, they also pose considerable challenges for the findability, accessibility, interoperability, and reusability (FAIR) of research outcomes due to infrastructure limitations, data usage constraints, or software license restrictions. Here we introduce a DataLad-based, domain-agnostic framework suitable for reproducible data processing in compliance with open science mandates. The framework attempts to minimize platform idiosyncrasies and performance-related complexities. It affords the capture of machine-actionable computational provenance records that can be used to retrace and verify the origins of research outcomes, as well as be re-executed independent of the original computing infrastructure. We demonstrate the framework's performance using two showcases: one highlighting data sharing and transparency (using the studyforrest.org dataset) and another highlighting scalability (using the largest public brain imaging dataset available: the UK Biobank dataset).


2016 ◽  
Author(s):  
Gang Wu ◽  
Ron C Anafi ◽  
Michael E Hughes ◽  
Karl Kornacker ◽  
John B Hogenesch

Summary: Detecting periodicity in large-scale data remains a challenge. Different algorithms offer strengths and weaknesses in statistical power, sensitivity to outliers, ease of use, and sampling requirements. While efforts have been made to identify best-of-breed algorithms, relatively little research has gone into integrating these methods into a generalizable approach. Here we present MetaCycle, an R package that incorporates ARSER, JTK_CYCLE, and Lomb-Scargle to conveniently evaluate periodicity in time-series data. Availability and implementation: MetaCycle package is available on the CRAN repository (https://cran.r-project.org/web/packages/MetaCycle/index.html) and GitHub (https://github.com/gangwug/MetaCycle). Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.
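Of the three methods named above, Lomb-Scargle is the simplest to illustrate. The following is a minimal pure-Python sketch of the classic Lomb-Scargle periodogram, evaluated over a list of candidate periods. It is not MetaCycle's implementation (MetaCycle is an R package), and the function names `lomb_scargle_power` and `best_period` are hypothetical.

```python
import math


def lomb_scargle_power(times, values, period):
    """Classic normalized Lomb-Scargle power at one candidate period.

    Assumes at least two samples with non-zero variance, and a sampling
    pattern for which neither the sine nor the cosine sum of squares is zero.
    """
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    omega = 2 * math.pi / period
    # Time offset tau makes the sine and cosine terms orthogonal.
    tau = math.atan2(
        sum(math.sin(2 * omega * t) for t in times),
        sum(math.cos(2 * omega * t) for t in times),
    ) / (2 * omega)
    cos_terms = [math.cos(omega * (t - tau)) for t in times]
    sin_terms = [math.sin(omega * (t - tau)) for t in times]
    y_cos = sum((v - mean) * c for v, c in zip(values, cos_terms))
    y_sin = sum((v - mean) * s for v, s in zip(values, sin_terms))
    cos_sq = sum(c * c for c in cos_terms)
    sin_sq = sum(s * s for s in sin_terms)
    return (y_cos ** 2 / cos_sq + y_sin ** 2 / sin_sq) / (2 * variance)


def best_period(times, values, candidate_periods):
    """Return the candidate period with the highest Lomb-Scargle power."""
    return max(candidate_periods, key=lambda p: lomb_scargle_power(times, values, p))


if __name__ == "__main__":
    # Synthetic, noise-free 24-hour rhythm sampled every 2 hours for 48 hours.
    times = list(range(0, 48, 2))
    values = [math.sin(2 * math.pi * t / 24 + 0.7) for t in times]
    print(best_period(times, values, range(20, 29)))
```

A real analysis would also attach a significance estimate to the peak and handle noisy, unevenly sampled series, which is where the method is most useful.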


Plants ◽  
2019 ◽  
Vol 8 (9) ◽  
pp. 309 ◽  
Author(s):  
Anna V. Klepikova ◽  
Aleksey A. Penin

For many years, progress in the identification of gene functions has been based on classical genetic approaches. However, considerable recent omics developments have brought to the fore indirect but high-resolution methods of gene function identification such as transcriptomics, proteomics, and metabolomics. A transcriptome map is a powerful source of functional information and the result of the genome-wide expression analysis of a broad sampling of tissues and/or organs from different developmental stages and/or environmental conditions. In plant science, the application of transcriptome maps extends from the inference of gene regulatory networks to evolutionary studies. However, only some of these data have been integrated into databases, thus enabling analyses to be conducted without raw data; without this integration, extensive data preprocessing is required, which limits data usability. In this review, we summarize the state of plant transcriptome maps, analyze the problems associated with the combined analysis of large-scale data from various studies, and outline possible solutions to these problems.

