SNP data analysis in genome-wide association studies

Genome-wide association studies (GWAS), a form of genetic data analysis, are used to discover common genetic factors that influence human health and contribute to disease. The use of genomics has grown in recent years, especially in e-healthcare, yet the field of genomics still requires substantial improvement. Note that genomics and genetics are not the same thing. The human genome is made up of DNA, which consists of four different chemical building blocks (called bases and abbreviated A, T, C, and G). Variation in the sequence of these bases is what distinguishes one human being from another. The term ‘genetics’ originated from the Greek word ‘genetikos’, which in turn derives from ‘genesis’, meaning ‘origin’. In simple terms, genetics is the branch of biology that studies the function and composition of individual genes in an organism. It has three main branches: classical genetics, molecular genetics, and population genetics.

GSEA-SNP: applying gene set enrichment analysis to SNP data from genome-wide association studies

Bioinformatics ◽

10.1093/bioinformatics/btn516 ◽

2008 ◽

Vol 24 (23) ◽

pp. 2784-2785 ◽

Cited By ~ 119

Author(s):

Marit Holden ◽

Shiwei Deng ◽

Leszek Wojnowski ◽

Bettina Kulle

Keyword(s):

Association Studies ◽

Enrichment Analysis ◽

Gene Set Enrichment Analysis ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Gene Set Enrichment ◽

Gene Set ◽

SNP Data ◽

Genome Wide

Longitudinal Data Analysis in Genome-Wide Association Studies

Genetic Epidemiology ◽

10.1002/gepi.21828 ◽

2014 ◽

Vol 38 (S1) ◽

pp. S68-S73 ◽

Cited By ~ 7

Author(s):

Joseph Beyene ◽

Jemila S. Hamid

Keyword(s):

Data Analysis ◽

Longitudinal Data ◽

Association Studies ◽

Longitudinal Data Analysis ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Genome Wide

Data analysis of Genome-Wide Association studies (GWAS) concerning rheumatoid arthritis and multiple sclerosis

Proceedings of the 10th IEEE International Conference on Information Technology and Applications in Biomedicine ◽

10.1109/itab.2010.5687817 ◽

2010 ◽

Author(s):

Konstantina Nikolaou ◽

Fanis G. Kalatzis ◽

Nikolaos Giannakeas ◽

Themis P. Exarchos ◽

Sofia Markoula ◽

...

Keyword(s):

Rheumatoid Arthritis ◽

Multiple Sclerosis ◽

Data Analysis ◽

Association Studies ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Genome Wide

EigenGWAS: finding loci under selection through genome-wide association studies of eigenvectors in structured populations

10.1101/023457 ◽

2015 ◽

Cited By ~ 1

Author(s):

Guo-Bo Chen ◽

Sang Hong Lee ◽

Zhi-Xiang Zhu ◽

Beben Benyamin ◽

Matthew R Robinson

Keyword(s):

Association Studies ◽

Structured Populations ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Statistical Framework ◽

SNP Data ◽

Genome Wide ◽

Single Marker ◽

Marker Regression ◽

Value Decomposition

We apply the statistical framework for genome-wide association studies (GWAS) to eigenvector decomposition (EigenGWAS), which is commonly used in population genetics to characterise the structure of genetic data. We show that loci under selection can be detected in a structured population by using eigenvectors as phenotypes in a single-marker GWAS. We find LCT to be under selection between the HapMap CEU and TSI cohorts, a finding that was replicated across European countries in the POPRES samples. HERC2 was also found to be differentiated both between the CEU and TSI cohorts and among the POPRES samples, reflecting the likely anthropological differences in skin and hair colour between northern and southern European populations. We show that, when determining the effect of a SNP on an eigenvector, three methods are equivalent to each other: single-marker regression of eigenvectors, best linear unbiased prediction of eigenvectors, and singular value decomposition of SNP data. We also demonstrate that estimated SNP effects on eigenvectors from a reference panel can be used to predict eigenvectors (the projected eigenvectors) in a target sample with high accuracy, particularly for the primary eigenvectors. Under this GWAS framework, ancestry informative markers and loci under selection can be identified, and population structure can be captured and easily interpreted. We have developed freely available software to facilitate the application of the methods (https://github.com/gc5k/GEAR/wiki/EigenGWAS).
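
The equivalence between single-marker regression of an eigenvector and singular value decomposition of the SNP matrix can be checked numerically. The following is a minimal NumPy sketch under stated assumptions, not the GEAR/EigenGWAS implementation: the two-population simulation, the function names, and all parameter values are hypothetical and chosen only for illustration. It assumes genotypes are standardized per SNP (mean 0, variance 1), so that regressing eigenvector u_k on SNP j gives a slope of x_j'u_k / n, which equals s_k v_k[j] / n from the SVD X = U S V' (up to the arbitrary sign of the eigenvector).

```python
import numpy as np


def simulate_genotypes(n_per_pop=100, n_snps=500, seed=0):
    """Simulate 0/1/2 genotypes for two subpopulations (illustrative only)."""
    rng = np.random.default_rng(seed)
    base = rng.uniform(0.2, 0.8, size=n_snps)
    shift = rng.normal(0.0, 0.1, size=n_snps)
    # Allele frequencies drift in opposite directions in the two groups.
    freqs = [np.clip(base + shift, 0.05, 0.95),
             np.clip(base - shift, 0.05, 0.95)]
    geno = np.vstack([rng.binomial(2, f, size=(n_per_pop, n_snps))
                      for f in freqs])
    return geno.astype(float)


def standardize(geno):
    """Drop monomorphic SNPs, then scale each SNP to mean 0 and variance 1."""
    keep = geno.std(axis=0) > 0
    g = geno[:, keep]
    return (g - g.mean(axis=0)) / g.std(axis=0)


def eigen_gwas_effects(x, k=0):
    """Regress the k-th leading eigenvector of the GRM on every SNP."""
    n, m = x.shape
    grm = x @ x.T / m                      # genetic relationship matrix
    _, eigvecs = np.linalg.eigh(grm)       # eigenvalues in ascending order
    u = eigvecs[:, ::-1][:, k]             # k-th largest eigenvector
    # Each column satisfies x_j' x_j = n, so the OLS slope is x_j' u / n.
    beta = x.T @ u / n
    return u, beta


def svd_loadings(x, k=0):
    """Same SNP effects obtained directly from the SVD of the SNP matrix."""
    _, s, vt = np.linalg.svd(x, full_matrices=False)
    return s[k] * vt[k] / x.shape[0]
```

Calling `eigen_gwas_effects(standardize(simulate_genotypes()))` returns the leading eigenvector and the per-SNP regression slopes. Comparing the absolute values of those slopes with `svd_loadings` on the same matrix should show agreement to numerical precision, because eigenvectors are only defined up to sign.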

SNPpy - Database Management for SNP Data from Genome Wide Association Studies

PLoS ONE ◽

10.1371/journal.pone.0024982 ◽

2011 ◽

Vol 6 (10) ◽

pp. e24982 ◽

Cited By ~ 6

Author(s):

Faheem Mitha ◽

Herodotos Herodotou ◽

Nedyalko Borisov ◽

Chen Jiang ◽

Josh Yoder ◽

...

Keyword(s):

Database Management ◽

Association Studies ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

SNP Data ◽

Genome Wide

LMM-22: An Enhanced Linear Mixed Model (LMM) Approach for Genome-Wide Association Studies (GWAS) for the Prediction of Diseases and Traits among Humans from Genomics Data

10.20944/preprints202005.0154.v1 ◽

2020 ◽

Author(s):

Siddharth Sharma

Keyword(s):

Data Analysis ◽

Graphics Processing Units ◽

Mixed Model ◽

Linear Mixed Model ◽

Association Studies ◽

Analysis Data ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Quality Of Data ◽

Genome Wide

Increasingly, genomics is being used for the prediction of specific traits and diseases (phenotypes) among humans. Wider availability of genomics data through multiple research projects (such as the International HapMap Project and 1000 Genomes) has been a catalyst in that direction. With the recent advances in machine learning and big data analysis, the computational resources and data models needed for genomics data analysis are readily available. However, the prediction of traits and diseases has its own challenges in terms of computational requirements, statistical analysis (for example, confounding variables), and limited quality of data collection. Linear Mixed Models (LMM, a type of linear regression) are a common approach in Genome-wide Association Studies (GWAS) for the prediction of common traits among humans using genomics. This paper reviews the existing LMM-based approaches for GWAS, describes an experiment performed with the FaST-LMM approach from Microsoft Research, and then proposes an enhanced approach (called LMM-22) to address computational and statistical issues. LMM-22 focuses on the parallelization of LMM computations and on executing them on General Purpose Graphics Processing Units (GPUs) rather than CPUs to accelerate the LMM approach for GWAS.
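
To make the LMM idea concrete, here is a minimal NumPy sketch of a single-SNP association test under a linear mixed model. It is not LMM-22 and not the FaST-LMM code; it is a generic textbook-style illustration, and every name, grid, and simulated value in it is an assumption made for this example. It assumes the model y = c + x*beta + g + e with Var(y) = sigma^2 (h2 K + (1 - h2) I), rotates the data by the eigenvectors of the kinship matrix K so the covariance becomes diagonal, and picks h2 by a simple maximum-likelihood grid search.

```python
import numpy as np


def lmm_single_snp(y, snp, kinship, grid=None):
    """Estimate a SNP effect under an LMM via spectral rotation (sketch)."""
    if grid is None:
        grid = np.linspace(0.01, 0.99, 99)     # candidate h2 values
    n = y.shape[0]
    lam, u = np.linalg.eigh(kinship)           # K = U diag(lam) U'
    y_rot = u.T @ y
    x_rot = u.T @ np.column_stack([np.ones(n), snp])
    best = None
    for h2 in grid:
        d = h2 * lam + (1.0 - h2)              # diagonal covariance / sigma^2
        w = 1.0 / d
        xtwx = x_rot.T @ (w[:, None] * x_rot)
        beta = np.linalg.solve(xtwx, x_rot.T @ (w * y_rot))  # GLS estimate
        resid = y_rot - x_rot @ beta
        sigma2 = np.sum(w * resid ** 2) / n    # ML estimate of sigma^2
        loglik = -0.5 * (n * np.log(2 * np.pi * sigma2)
                         + np.sum(np.log(d)) + n)
        if best is None or loglik > best[0]:
            cov = sigma2 * np.linalg.inv(xtwx)
            best = (loglik, h2, beta[1], np.sqrt(cov[1, 1]))
    return {"h2": best[1], "beta": best[2], "se": best[3]}


def demo(seed=1, n=300, m=400, true_beta=0.5):
    """Simulate a polygenic trait plus one tested SNP (hypothetical data)."""
    rng = np.random.default_rng(seed)
    geno = rng.binomial(2, 0.3, size=(n, m)).astype(float)
    z = (geno - geno.mean(axis=0)) / geno.std(axis=0)
    kinship = z @ z.T / m                      # kinship from background SNPs
    test_snp = rng.binomial(2, 0.4, size=n).astype(float)
    polygenic = z @ rng.normal(0.0, np.sqrt(0.5 / m), size=m)
    noise = rng.normal(0.0, np.sqrt(0.5), size=n)
    y = true_beta * test_snp + polygenic + noise
    return lmm_single_snp(y, test_snp, kinship)
```

Calling `demo()` returns a dictionary with the selected `h2`, the estimated SNP effect `beta`, and its standard error `se`. The eigendecomposition is computed once and reused for every h2 value (and could be reused for every SNP), which is the kind of per-SNP independent work that parallel or GPU execution targets.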

Gene Set Analysis of SNP Data from Genome-wide Association Studies

Bioinformatics in Aquaculture ◽

10.1002/9781118782392.ch24 ◽

2017 ◽

pp. 434-459

Author(s):

Shikai Liu ◽

Peng Zeng ◽

Zhanjiang Liu

Keyword(s):

Association Studies ◽

Genome Wide Association ◽

Gene Set Analysis ◽

Genome Wide Association Studies ◽

Gene Set ◽

SNP Data ◽

Genome Wide

Introduction to Statistical Methods for Integrative Data Analysis in Genome-Wide Association Studies

Big Data Analytics in Genomics ◽

10.1007/978-3-319-41279-5_1 ◽

2016 ◽

pp. 3-23 ◽

Cited By ~ 2

Author(s):

Can Yang ◽

Xiang Wan ◽

Jin Liu ◽

Michael Ng

Keyword(s):

Data Analysis ◽

Statistical Methods ◽

Association Studies ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Integrative Data Analysis ◽

Genome Wide
