SeqBreed: a python tool to evaluate genomic prediction in complex scenarios

2019 ◽  
Author(s):  
M. Pérez-Enciso ◽  
L. C. Ramírez-Ayala ◽  
L.M. Zingaretti

Abstract Background Genomic Prediction (GP) is the procedure whereby molecular information is used to predict complex phenotypes. Although GP can significantly enhance predictive accuracy, it can be expensive and difficult to implement. To help in designing optimum experiments, including genome wide association studies and genomic selection experiments, we have developed SeqBreed, a generic and flexible python3 forward simulator. Results SeqBreed accommodates sex and mitochondrion chromosomes as well as autopolyploidy. It can simulate any number of complex phenotypes determined by any number of causal loci. SeqBreed implements several GP methods, including single step GBLUP. We demonstrate its functionality with Drosophila Genome Reference Panel (DGRP) sequence data and with tetraploid potato genotypes. Conclusions SeqBreed is a flexible and easy to use tool appropriate for optimizing GP or genome wide association studies. It incorporates some of the most popular GP methods and includes several visualization tools. Code is open and can be freely modified. Software, documentation and examples are available at https://github.com/miguelperezenciso/SeqBreed.

Genes ◽  
2019 ◽  
Vol 10 (9) ◽  
pp. 669 ◽  
Author(s):  
Peter S. Kristensen ◽  
Just Jensen ◽  
Jeppe R. Andersen ◽  
Carlos Guzmán ◽  
Jihad Orabi ◽  
...  

Use of genetic markers and genomic prediction might improve genetic gain for quality traits in wheat breeding programs. Here, flour yield and Alveograph quality traits were inspected in 635 F6 winter wheat breeding lines from two breeding cycles. Genome-wide association studies revealed single nucleotide polymorphisms (SNPs) on chromosome 5D significantly associated with flour yield, Alveograph P (dough tenacity), and Alveograph W (dough strength). Additionally, SNPs on chromosome 1D were associated with Alveograph P and W, SNPs on chromosome 1B were associated with Alveograph P, and SNPs on chromosome 4A were associated with Alveograph L (dough extensibility). Predictive abilities based on genomic best linear unbiased prediction (GBLUP) models ranged from 0.50 for flour yield to 0.79 for Alveograph W based on a leave-one-out cross-validation strategy. Predictive abilities were negatively affected by smaller training set sizes, lower genetic relationship between lines in training and validation sets, and by genotype–environment (G×E) interactions. Bayesian Power Lasso models and genomic feature models resulted in similar or slightly improved predictions compared to GBLUP models. SNPs with the largest effects can be used for screening large numbers of lines in early generations in breeding programs to select lines that potentially have good quality traits. In later generations, genomic predictions might be used for a more accurate selection of high quality wheat lines.
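The leave-one-out evaluation of GBLUP described in this abstract can be sketched roughly as follows. This is a minimal illustration on simulated data, not the authors' pipeline: the function names (`solve`, `grm`, `loo_gblup`, `pearson`), the simplified scaling of the relationship matrix (division by the number of markers), and the fixed shrinkage value `lam` (standing in for the ratio of residual to genetic variance) are all assumptions made for the example. Predictive ability is taken here as the Pearson correlation between observed and predicted phenotypes.

```python
import random
from statistics import mean


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def grm(genotypes):
    """Genomic relationship matrix from column-centred allele dosages.

    Simplified scaling (assumption): divide by the number of markers.
    """
    n, p = len(genotypes), len(genotypes[0])
    col_means = [mean(row[j] for row in genotypes) for j in range(p)]
    z = [[row[j] - col_means[j] for j in range(p)] for row in genotypes]
    return [[sum(z[i][k] * z[j][k] for k in range(p)) / p for j in range(n)]
            for i in range(n)]


def loo_gblup(genotypes, y, lam=1.0):
    """Predict each line's phenotype from all other lines (leave-one-out)."""
    g = grm(genotypes)
    n = len(y)
    preds = []
    for i in range(n):
        train = [j for j in range(n) if j != i]
        mu = mean(y[j] for j in train)
        # (G_train + lam * I) alpha = y_train - mu
        a = [[g[r][c] + (lam if r == c else 0.0) for c in train] for r in train]
        alpha = solve(a, [y[j] - mu for j in train])
        preds.append(mu + sum(g[i][j] * w for j, w in zip(train, alpha)))
    return preds


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Simulated toy data (hypothetical sizes and effects, not the wheat data).
random.seed(1)
n_lines, n_markers = 30, 40
genotypes = [[random.choice([0, 1, 2]) for _ in range(n_markers)]
             for _ in range(n_lines)]
effects = [random.gauss(0, 1) for _ in range(n_markers)]
phenotypes = [sum(e * x for e, x in zip(effects, row)) + random.gauss(0, 2)
              for row in genotypes]

predictions = loo_gblup(genotypes, phenotypes, lam=1.0)
predictive_ability = pearson(phenotypes, predictions)
print(f"leave-one-out predictive ability: {predictive_ability:.2f}")
```

Shrinking the training set or lowering the relatedness between training and held-out lines in such a simulation is one way to explore the effects the abstract reports, although the toy data carry no information about the actual wheat lines.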


GigaScience ◽  
2020 ◽  
Vol 9 (8) ◽  
Author(s):  
Arash Bayat ◽  
Piotr Szul ◽  
Aidan R O’Brien ◽  
Robert Dunne ◽  
Brendan Hosking ◽  
...  

Abstract Background Many traits and diseases are thought to be driven by >1 gene (polygenic). Polygenic risk scores (PRS) hence expand on genome-wide association studies by taking multiple genes into account when risk models are built. However, PRS only considers the additive effect of individual genes but not epistatic interactions or the combination of individual and interacting drivers. While evidence of epistatic interactions is found in small datasets, large datasets have not been processed yet owing to the high computational complexity of the search for epistatic interactions. Findings We have developed VariantSpark, a distributed machine learning framework able to perform association analysis for complex phenotypes that are polygenic and potentially involve a large number of epistatic interactions. Efficient multi-layer parallelization allows VariantSpark to scale to the whole genome of population-scale datasets with 100,000,000 genomic variants and 100,000 samples. Conclusions Compared with traditional monogenic genome-wide association studies, VariantSpark better identifies genomic variants associated with complex phenotypes. VariantSpark is 3.6 times faster than ReForeSt and the only method able to scale to ultra-high-dimensional genomic data in a manageable time.
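The limitation of purely additive scores mentioned in this abstract can be illustrated with a toy case. The sketch below is not VariantSpark's method or any published PRS procedure; the two-locus data set, the XOR-style trait, and the helper names (`marginal_effect`, `additive_score`) are hypothetical and chosen only so that every marginal effect is exactly zero while the trait is fully determined by the interaction.

```python
from itertools import product

# Hypothetical toy data: every combination of carrier status at two loci, once each.
genotypes = list(product([0, 1], repeat=2))

# Assumed purely epistatic trait: expressed only when exactly one locus
# carries the variant (an XOR pattern).
phenotype = [a ^ b for a, b in genotypes]


def marginal_effect(locus):
    """Difference in mean phenotype between carriers and non-carriers at one locus."""
    carriers = [y for g, y in zip(genotypes, phenotype) if g[locus] == 1]
    others = [y for g, y in zip(genotypes, phenotype) if g[locus] == 0]
    return sum(carriers) / len(carriers) - sum(others) / len(others)


# Additive weights estimated one locus at a time, as in a single-variant scan.
weights = [marginal_effect(0), marginal_effect(1)]


def additive_score(g):
    """Weighted sum of carrier indicators: an additive risk score."""
    return sum(w * x for w, x in zip(weights, g))


scores = [additive_score(g) for g in genotypes]

for g, y, s in zip(genotypes, phenotype, scores):
    print(g, "phenotype:", y, "additive score:", s)
```

Both weights come out as zero, so the additive score is identical for all four genotype classes even though the phenotype differs between them; detecting such a signal requires a model that considers the loci jointly.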


2020 ◽  
Vol 103 (11) ◽  
pp. 10347-10360
Author(s):  
Pamela I. Otto ◽  
Simone E.F. Guimarães ◽  
Mario P.L. Calus ◽  
Jeremie Vandenplas ◽  
Marco A. Machado ◽  
...  

Animals ◽  
2021 ◽  
Vol 11 (2) ◽  
pp. 541
Author(s):  
Long Chen ◽  
Jennie E. Pryce ◽  
Ben J. Hayes ◽  
Hans D. Daetwyler

Structural variations (SVs) are large DNA segments of deletions, duplications, copy number variations, inversions and translocations in a re-sequenced genome compared to a reference genome. They have been found to be associated with several complex traits in dairy cattle and could potentially help to improve genomic prediction accuracy of dairy traits. Imputation of SVs was performed in individuals genotyped with single-nucleotide polymorphism (SNP) panels without the expense of sequencing them. In this study, we generated 24,908 high-quality SVs in a total of 478 whole-genome sequenced Holstein and Jersey cattle. We imputed 4489 SVs with R² > 0.5 into 35,568 Holstein and Jersey dairy cattle with 578,999 SNPs with two pipelines, FImpute and Eagle2.3-Minimac3. Genome-wide association studies for production, fertility and overall type with these 4489 SVs revealed four significant SVs, of which two were highly linked to significant SNP. We also estimated the variance components for SNP and SV models for these traits using genomic best linear unbiased prediction (GBLUP). Furthermore, we assessed the effect on genomic prediction accuracy of adding SVs to GBLUP models. The estimated percentage of genetic variance captured by SVs for production traits was up to 4.57% for milk yield in bulls and 3.53% for protein yield in cows. Finally, no consistent increase in genomic prediction accuracy was observed when including SVs in GBLUP.


2019 ◽  
Vol 61 (1) ◽  
pp. 113-115 ◽  
Author(s):  
Francisco Ribeiro de Araujo Neto ◽  
Daniel Jordan de Abreu Santos ◽  
Gerardo Alves Fernandes Júnior ◽  
Rusbel Raul Aspilcueta-Borquis ◽  
André Vieira do Nascimento ◽  
...  
