BLINK: A Package for Next Level of Genome Wide Association Studies with Both Individuals and Markers in Millions

2017 ◽  
Author(s):  
Meng Huang ◽  
Xiaolei Liu ◽  
Yao Zhou ◽  
Ryan M. Summers ◽  
Zhiwu Zhang

Big data accumulated from biomedical and agronomic studies provides the potential to identify genes controlling complex human diseases and agriculturally important traits through genome-wide association studies (GWAS). However, big data also poses extreme computational challenges, especially when sophisticated statistical models are employed to reduce false positives and false negatives simultaneously. The recently developed Fixed and random model Circulating Probability Unification (FarmCPU) method uses a bin method under the assumption that Quantitative Trait Nucleotides (QTNs) are evenly distributed throughout the genome. The estimated QTNs are used to separate a mixed linear model into a computationally efficient fixed effect model (FEM) and a computationally expensive random effect model (REM), which are then used iteratively. To eliminate the computationally expensive REM entirely, we replaced REM with FEM by using Bayesian information criteria. To eliminate the requirement that QTNs be evenly distributed throughout the genome, we replaced the bin method with linkage disequilibrium information. The new method is called Bayesian-information and Linkage-disequilibrium Iteratively Nested Keyway (BLINK). Analyses of both real and simulated data demonstrated that BLINK improves statistical power compared to FarmCPU and also substantially reduces computing time. A dataset with half a million markers and one million individuals can now be analyzed within five hours, compared with one week using FarmCPU.
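The two substitutions described in this abstract (linkage disequilibrium in place of bins, and a BIC-scored fixed effect model in place of the random effect model) can be illustrated with a minimal Python sketch. This is not the BLINK package's code: the function names, the r² threshold of 0.7, the ordinary least squares fit, and the simulated data are all assumptions chosen for illustration.

```python
import numpy as np

def ld_prune(genotypes, order, r2_max=0.7):
    """Walk markers in `order` (most significant first) and drop any marker
    whose squared correlation with an already kept marker exceeds r2_max.
    The 0.7 default is an illustrative assumption."""
    kept = []
    for j in order:
        in_ld = any(
            np.corrcoef(genotypes[:, j], genotypes[:, k])[0, 1] ** 2 > r2_max
            for k in kept
        )
        if not in_ld:
            kept.append(j)
    return kept

def bic_fixed_model(y, covariates):
    """BIC of an ordinary least squares fit of y on an intercept plus covariates."""
    n = len(y)
    design = np.column_stack([np.ones(n), covariates])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ beta) ** 2))
    return n * np.log(rss / n) + design.shape[1] * np.log(n)

def select_by_bic(y, genotypes, candidates):
    """Keep the leading candidates whose fixed effect model has the lowest BIC."""
    best_m = 0
    best_bic = bic_fixed_model(y, genotypes[:, :0])  # intercept-only model
    for m in range(1, len(candidates) + 1):
        bic = bic_fixed_model(y, genotypes[:, candidates[:m]])
        if bic < best_bic:
            best_m, best_bic = m, bic
    return candidates[:best_m]

# Simulated example: marker 1 duplicates marker 0; markers 0 and 3 are causal.
rng = np.random.default_rng(0)
genotypes = rng.integers(0, 3, size=(200, 5)).astype(float)
genotypes[:, 1] = genotypes[:, 0]
y = 2.0 * genotypes[:, 0] + 1.5 * genotypes[:, 3] + rng.normal(0, 0.5, 200)
order = [0, 1, 3, 2, 4]  # assumed ranking from a prior single-marker scan
kept = ld_prune(genotypes, order)
selected = select_by_bic(y, genotypes, kept)
print(kept, selected)
```

In this sketch the duplicated marker is removed by the linkage disequilibrium step, and the BIC step then decides how many of the remaining candidates are retained as covariates, without fitting any random effect.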

Agronomy ◽  
2020 ◽  
Vol 10 (12) ◽  
pp. 2006
Author(s):  
David P. Horvath ◽  
Michael Stamm ◽  
Zahirul I. Talukder ◽  
Jason Fiedler ◽  
Aidan P. Horvath ◽  
...  

A diverse, 429-member population of canola (Brassica napus L.) consisting primarily of winter biotypes was assembled and used in genome-wide association studies. Genotyping-by-sequencing analysis of the population identified and mapped 290,972 high-quality markers, with 18.5 to 82.4% missing markers per line (average 36.8%). After interpolation, 251,575 high-quality markers remained. After removing markers with low minor allele counts (retaining those with count > 5), 190,375 markers remained. The average distance between these markers is 4463 bases, with a median of 69 and a range from 1 to 281,248 bases. Heterozygosity in the imputed population ranges from 0.9 to 11.0%, with an average of 5.4%. The filtered and imputed dataset was used to determine population structure and kinship, which indicated that the population had minimal structure, with a best K value of 2–3. These results also indicated that the majority of the population has substantial sequence from a single population, with sub-clusters of, and admixtures with, a very small number of other populations. Chromosomal linkage disequilibrium decay ranged from ~7 kb for chromosome A01 to ~68 kb for chromosome C01. Local linkage decay rates, determined for all 500 kb windows with a 10 kb sliding step, varied widely, indicating numerous crossover hotspots within this population. These rates provide a resource for determining the likely limits of linkage disequilibrium around any given marker when identifying candidate genes. This population and the resources provided here should serve as helpful tools for investigating genetics in winter canola.
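The minor allele count filter and the marker spacing statistics reported above can be sketched as follows. This is an illustrative sketch, not the authors' pipeline: it assumes genotypes coded as 0, 1, or 2 copies of the alternate allele with NaN for missing calls, and it assumes all positions lie on a single chromosome. The function names are hypothetical.

```python
import numpy as np

def minor_allele_count(genotypes):
    """Per-marker minor allele count for an individuals x markers matrix
    coded 0/1/2 (alternate allele copies), with NaN for missing calls."""
    alt = np.nansum(genotypes, axis=0)
    total = 2 * np.sum(~np.isnan(genotypes), axis=0)
    return np.minimum(alt, total - alt)

def filter_by_mac(genotypes, positions, min_count=5):
    """Retain markers whose minor allele count is greater than min_count."""
    keep = minor_allele_count(genotypes) > min_count
    return genotypes[:, keep], positions[keep]

def spacing_summary(positions):
    """Mean, median, and range of gaps between adjacent markers
    (assumes all positions are on one chromosome)."""
    gaps = np.diff(np.sort(positions))
    return {
        "mean": float(gaps.mean()),
        "median": float(np.median(gaps)),
        "min": int(gaps.min()),
        "max": int(gaps.max()),
    }

# Toy data: 4 individuals, 3 markers, one missing call.
toy = np.array([
    [0.0, 2.0, 1.0],
    [1.0, 2.0, 1.0],
    [2.0, 2.0, np.nan],
    [2.0, 2.0, 0.0],
])
toy_positions = np.array([100, 250, 1000])
filtered, filtered_positions = filter_by_mac(toy, toy_positions, min_count=1)
print(filtered_positions, spacing_summary(filtered_positions))
```

With real data the spacing summary would be computed per chromosome and then pooled, since gaps across chromosome boundaries are not meaningful.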


Author(s):  
Emma F. Magavern ◽  
Helen R. Warren ◽  
Fu L. Ng ◽  
Claudia P. Cabrera ◽  
Patricia B. Munroe ◽  
...  

At the dawn of the new decade, it is judicious to reflect on the boom of knowledge about polygenic risk for essential hypertension supplied by the wealth of genome-wide association studies. Hypertension continues to account for significant cardiovascular morbidity and mortality, and its prevalence is expected to increase. Here, we review recent advances in the use of big data to understand polygenic hypertension, as well as opportunities for future innovation to translate this windfall of knowledge into clinical benefit.

