Genomic prediction of weight and wool traits in a multi-breed sheep population

N. Moghaddar; A. A. Swan; J. H. J. van der Werf

doi:10.1071/an13129

Animal Production Science ◽

10.1071/an13129 ◽

2014 ◽

Vol 54 (5) ◽

pp. 544 ◽

Cited By ~ 4

Author(s):

N. Moghaddar ◽

A. A. Swan ◽

J. H. J. van der Werf

Keyword(s):

Genomic Prediction ◽

Pearson Correlation ◽

Reference Population ◽

Breeding Value ◽

Genomic Relationship Matrix ◽

Relationship Matrix ◽

Sheep Population ◽

Australian Sheep ◽

Estimated Breeding Value ◽

Breed Sheep

The objective of this study was to predict the accuracy of genomic prediction for 26 traits, including weight, muscle, fat, and wool quantity and quality traits, in Australian sheep based on a large, multi-breed reference population. The reference population consisted of two research flocks, with the main breeds being Merino, Border Leicester (BL), Poll Dorset (PD), and White Suffolk (WS). The genomic estimated breeding value (GEBV) was based on GBLUP (genomic best linear unbiased prediction), applying a genomic relationship matrix calculated from the 50K Ovine SNP chip marker genotypes. The accuracy of GEBV was evaluated as the Pearson correlation coefficient between GEBV and accurate estimated breeding values based on progeny records in a set of genotyped industry animals. The accuracies of GEBV for weight traits were relatively low to moderate in PD and WS breeds (0.11–0.27) and moderate to relatively high in BL and Merino (0.25–0.63). The accuracy of GEBV for muscle and fat traits was moderate to relatively high across all breeds (0.21–0.55). The accuracy of GEBV for yearling and adult wool traits in Merino was, on average, high (0.33–0.75). The results showed that the accuracy of genomic prediction depends on trait heritability and the effective size of the reference population, whereas the observed GEBV accuracies were more related to the breed proportions in the multi-breed reference population. No extra gain in within-breed GEBV accuracy was observed based on across-breed information. More investigations are required to determine the precise effect of across-breed information on within-breed genomic prediction.
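The pipeline described in the abstract above (a genomic relationship matrix from SNP genotypes, GBLUP, then accuracy as a Pearson correlation) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how the relationship matrix was scaled, so the VanRaden-style construction, the use of a fixed heritability `h2`, the overall mean as the only fixed effect, and all function names are assumptions.

```python
import numpy as np

def vanraden_grm(genotypes):
    """Genomic relationship matrix from 0/1/2 allele counts (animals x SNPs).

    Assumption: VanRaden-style centering by 2p and scaling by 2*sum(p*(1-p)),
    with allele frequencies p estimated from the genotyped animals themselves.
    """
    M = np.asarray(genotypes, dtype=float)
    p = M.mean(axis=0) / 2.0              # observed allele frequency per SNP
    Z = M - 2.0 * p                       # centered genotype matrix
    return Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))

def gblup(G, y, train_idx, h2):
    """GEBV for every animal in G, using phenotypes of the training animals only.

    Assumption: known heritability h2 and an overall mean as the only fixed effect.
    """
    lam = (1.0 - h2) / h2                 # ratio of residual to genetic variance
    G_tt = G[np.ix_(train_idx, train_idx)]
    y_t = np.asarray(y, dtype=float)[train_idx]
    alpha = np.linalg.solve(G_tt + lam * np.eye(len(train_idx)), y_t - y_t.mean())
    return G[:, train_idx] @ alpha

def accuracy(gebv, ebv):
    """Pearson correlation between GEBV and a reference breeding value."""
    return np.corrcoef(gebv, ebv)[0, 1]
```

In a validation like the one described, `train_idx` would index the reference animals, and `accuracy` would be computed only over the genotyped industry animals, against their progeny-based breeding values.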

Exploring the size of reference population for expected accuracy of genomic prediction using simulated and real data in Japanese Black cattle

BMC Genomics ◽

10.1186/s12864-021-08121-z ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Masayuki Takeda ◽

Keiichi Inoue ◽

Hidemi Oyama ◽

Katsuo Uchiyama ◽

Kanako Yoshinari ◽

...

Keyword(s):

Genomic Prediction ◽

Simulation Analysis ◽

Carcass Traits ◽

Real Data ◽

Reference Population ◽

Weighting Factor ◽

Breeding Value ◽

Japanese Black Cattle ◽

Simulation Results ◽

Estimated Breeding Value

Background: The size of the reference population is a crucial factor affecting the accuracy of prediction of the genomic estimated breeding value (GEBV). Few studies in beef cattle have compared accuracies achieved using real data with those achieved using simulated data and deterministic predictions. Thus, the extent to which traits of interest affect the accuracy of genomic prediction in Japanese Black cattle remains obscure. This study aimed to explore the size of reference population needed for the expected accuracy of genomic prediction for simulated and carcass traits in Japanese Black cattle using a large number of samples. Results: A simulation analysis showed that heritability and size of reference population substantially impacted the accuracy of GEBV, whereas the number of quantitative trait loci did not. The estimated numbers of independent chromosome segments (Me) and the related weighting factor (w) derived from simulation results and a maximum likelihood (ML) approach were 1900–3900 and 1, respectively. The expected accuracy for a trait with heritability of 0.1–0.5 fitted well with empirical values when the reference population comprised > 5000 animals. The heritability for carcass traits was estimated to be 0.29–0.41, and the accuracy of GEBVs was relatively consistent with simulation results. When the reference population comprised 7000–11,000 animals, the accuracy of GEBV for carcass traits can range from 0.73 to 0.79, which is comparable to that of estimated breeding values obtained in a progeny test. Conclusion: Our simulation analysis demonstrated that the expected accuracy of GEBV for a polygenic trait with low-to-moderate heritability could be practical in the Japanese Black cattle population. For carcass traits, a total of 7000–11,000 animals can be a sufficient size of reference population for genomic prediction.
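The relationship between reference population size, heritability, and Me can be illustrated with a commonly used deterministic expression for expected accuracy, r = sqrt(N h2 / (N h2 + Me)). This is a hedged sketch: the abstract does not state the exact formula the authors fitted or where the weighting factor w enters it, so the expression below is an assumption that corresponds to the reported case w = 1, and the function name is hypothetical.

```python
import math

def expected_accuracy(n_ref, h2, me):
    """Deterministic expected GEBV accuracy, sqrt(N*h2 / (N*h2 + Me)).

    Assumption: this widely used form stands in for the authors' expression
    with weighting factor w = 1; it is not taken from the abstract itself.
    """
    if n_ref < 0 or me <= 0 or not 0.0 < h2 <= 1.0:
        raise ValueError("need n_ref >= 0, me > 0 and 0 < h2 <= 1")
    return math.sqrt(n_ref * h2 / (n_ref * h2 + me))

# Example grid using values quoted in the abstract (Me between 1900 and 3900).
for n_ref in (5000, 7000, 11000):
    for me in (1900, 3900):
        print(n_ref, me, round(expected_accuracy(n_ref, 0.3, me), 3))
```

Under this form, accuracy rises with both N and h2 and falls as Me grows, which matches the qualitative pattern reported for the simulation.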

Accuracy of estimated genomic breeding values for wool and meat traits in a multi-breed sheep population

Animal Production Science ◽

10.1071/an10096 ◽

2010 ◽

Vol 50 (12) ◽

pp. 1004 ◽

Cited By ~ 64

Author(s):

H. D. Daetwyler ◽

J. M. Hickey ◽

J. M. Henshall ◽

S. Dominik ◽

B. Gredler ◽

...

Keyword(s):

Reference Population ◽

Eye Muscle ◽

Single Nucleotide ◽

Sheep Population ◽

Sheep Breeding ◽

Breeding Values ◽

Lower Accuracy ◽

Australian Sheep ◽

Estimated Breeding Values ◽

Breed Sheep

Estimated breeding values for the selection of more profitable sheep for the sheep meat and wool industries are currently based on pedigree and phenotypic records. With the advent of a medium-density DNA marker array, which genotypes ~50 000 ovine single nucleotide polymorphisms, a third source of information has become available. The aim of this paper was to determine whether this genomic information can be used to predict estimated breeding values for wool and meat traits. The effects of all single nucleotide polymorphism markers in a multi-breed sheep reference population of 7180 individuals with phenotypic records were estimated to derive prediction equations for genomic estimated breeding values (GEBV) for greasy fleece weight, fibre diameter, staple strength, breech wrinkle score, weight at ultrasound scanning, scanned eye muscle depth and scanned fat depth. Five hundred and forty industry sires with very accurate Australian sheep breeding values were used as a validation population and the accuracies of GEBV were assessed according to correlations between GEBV and Australian sheep breeding values. The accuracies of GEBV ranged from 0.15 to 0.79 for wool traits in Merino sheep and from –0.07 to 0.57 for meat traits in all breeds studied. Merino industry sires tended to have more accurate GEBV than terminal and maternal breeds because the reference population consisted mainly of Merino haplotypes. The lower accuracy for terminal and maternal breeds suggests that the density of genetic markers used was not high enough for accurate across-breed prediction of marker effects. Our results indicate that an increase in the size of the reference population will increase the accuracy of GEBV.

On the use of whole-genome sequence data for across-breed genomic prediction and fine-scale mapping of QTL

Genetics Selection Evolution ◽

10.1186/s12711-021-00607-4 ◽

2021 ◽

Vol 53 (1) ◽

Author(s):

Theo Meuwissen ◽

Irene van den Berg ◽

Mike Goddard

Keyword(s):

Variable Selection ◽

Genome Sequence ◽

Genomic Prediction ◽

Milk Fat ◽

Genotype Imputation ◽

Whole Genome Sequence ◽

Genomic Relationship Matrix ◽

Polygenic Effect ◽

Relationship Matrix ◽

Whole Genome

Background: Whole-genome sequence (WGS) data are increasingly available on large numbers of individuals in animal and plant breeding and in human genetics through second-generation resequencing technologies, 1000 genomes projects, and large-scale genotype imputation from lower marker densities. Here, we present a computationally fast implementation of a variable selection genomic prediction method that can handle WGS data on more than 35,000 individuals, test its accuracy for across-breed predictions, and assess its quantitative trait locus (QTL) mapping precision. Methods: The Monte Carlo Markov chain (MCMC) variable selection model (Bayes GC) simultaneously fits a genomic best linear unbiased prediction (GBLUP) term, i.e. a polygenic effect whose correlations are described by a genomic relationship matrix (G), and a Bayes C term, i.e. a set of single nucleotide polymorphisms (SNPs) with large effects selected by the model. Computational speed is improved by a Metropolis–Hastings sampling that directs computations to the SNPs that are, a priori, most likely to be included in the model. Speed is also improved by running many relatively short MCMC chains. Memory requirements are reduced by storing the genotype matrix in binary form. The model was tested on a WGS dataset containing Holstein, Jersey and Australian Red cattle. The data contained 4,809,520 genotypes on 35,549 individuals together with their milk, fat and protein yields, and fat and protein percentage traits. Results: The prediction accuracies of the Jersey individuals improved by 1.5% when using across-breed GBLUP compared to within-breed predictions. Using WGS instead of 600 k SNP-chip data yielded on average a 3% accuracy improvement for Australian Red cows. QTL were fine-mapped by locating the SNP with the highest posterior probability of being included in the model. Various QTL known from the literature were rediscovered, and a new SNP affecting milk production was discovered on chromosome 20 at 34.501126 Mb. Due to the high mapping precision, it was clear that many of the discovered QTL were the same across the five dairy traits. Conclusions: Across-breed Bayes GC genomic prediction improved prediction accuracies compared to GBLUP. The combination of across-breed WGS data and Bayesian genomic prediction proved remarkably effective for the fine-mapping of QTL.
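The abstract states only that memory is reduced by storing the genotype matrix in binary form. One plausible way to do that, sketched here as an assumption rather than as the authors' storage format, is to pack each 0/1/2 genotype into two bits so that four genotypes share one byte. The function names are hypothetical.

```python
import numpy as np

def pack_genotypes(genotypes):
    """Pack a 1-D array of genotype codes (0, 1, 2) into 2 bits each.

    Assumption: a simple four-genotypes-per-byte layout; the code 3 is unused
    here and could mark a missing genotype.
    """
    g = np.asarray(genotypes, dtype=np.uint8)
    pad = (-len(g)) % 4                                   # pad to a multiple of 4
    g = np.concatenate([g, np.zeros(pad, dtype=np.uint8)]).reshape(-1, 4)
    packed = g[:, 0] | (g[:, 1] << 2) | (g[:, 2] << 4) | (g[:, 3] << 6)
    return packed.astype(np.uint8)

def unpack_genotypes(packed, n):
    """Recover the first n genotype codes from a packed byte array."""
    packed = np.asarray(packed, dtype=np.uint8)
    codes = np.stack([(packed >> shift) & 3 for shift in (0, 2, 4, 6)], axis=1)
    return codes.reshape(-1)[:n]
```

Compared with one byte per genotype, this layout uses a quarter of the memory, at the cost of unpacking when genotypes are read.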

Application of Low Coverage Genotyping by Sequencing in Selectively Bred Arctic Charr (Salvelinus alpinus)

G3 Genes|Genome|Genetics ◽

10.1534/g3.120.401295 ◽

2020 ◽

Vol 10 (6) ◽

pp. 2069-2078 ◽

Cited By ~ 2

Author(s):

Christos Palaiokostas ◽

Shannon M. Clarke ◽

Henrik Jeuthe ◽

Rudiger Brauning ◽

Timothy P. Bilton ◽

...

Keyword(s):

Principal Components ◽

Salvelinus Alpinus ◽

Arctic Charr ◽

Genotyping By Sequencing ◽

Breeding Value ◽

Genomic Relationship Matrix ◽

Relationship Matrix ◽

Genomic Relationship ◽

Value Estimation ◽

Low Coverage

Arctic charr (Salvelinus alpinus) is a species of high economic value for the aquaculture industry, and of high ecological value due to its Holarctic distribution in both marine and freshwater environments. Novel genome sequencing approaches enable the study of population and quantitative genetic parameters even in species with limited or no prior genomic resources. Low coverage genotyping by sequencing (GBS) was applied in a selected strain of Arctic charr in Sweden originating from a landlocked freshwater population. For the needs of the current study, animals from year classes 2013 (171 animals, parental population) and 2017 (759 animals; 13 full sib families) were used as a template for identifying genome-wide single nucleotide polymorphisms (SNPs). GBS libraries were constructed using the PstI and MspI restriction enzymes. Approximately 14.5K SNPs passed quality control and were used for estimating a genomic relationship matrix. Thereafter, a wide range of analyses were conducted in order to gain insights regarding genetic diversity and investigate the efficiency of the genomic information for parentage assignment and breeding value estimation. Heterozygosity estimates for both year classes suggested a slight excess of heterozygotes. Furthermore, FST estimates among the families of year class 2017 ranged between 0.009 and 0.066. Principal components analysis (PCA) and discriminant analysis of principal components (DAPC) were applied to identify genetic clusters in the studied population. Results obtained were in accordance with pedigree records, allowing the identification of individual families. Additionally, DNA parentage verification was performed, with results in accordance with the pedigree records with the exception of a putative dam where full sib genotypes suggested a potential recording error. Breeding value estimation for juvenile growth using the estimated genomic relationship matrix clearly outperformed the pedigree equivalent in terms of prediction accuracy (0.51 as opposed to 0.31). Overall, low coverage GBS has proven to be a cost-effective genotyping platform that is expected to boost the selection efficiency of the Arctic charr breeding program.

335 Genomic predictions with a multi-breed genomic relationship matrix

Journal of Animal Science ◽

10.1093/jas/skz258.099 ◽

2019 ◽

Vol 97 (Supplement_3) ◽

pp. 49-50

Author(s):

Yvette Steyn ◽

Daniela Lourenco ◽

Ignacy Misztal

Keyword(s):

Prediction Accuracy ◽

Negative Impact ◽

Reference Population ◽

Single Step ◽

Genomic Relationship Matrix ◽

Relationship Matrix ◽

Genomic Relationship ◽

Effective Population ◽

Specific Allele ◽

Missing Genotypes

Multi-breed evaluations have the advantage of increasing the size of the reference population for genomic evaluations and are quite simple; however, combining breeds usually has a negative impact on prediction accuracy. The aim of this study was to evaluate the use of a multi-breed genomic relationship matrix (G), where SNP for each breed are non-shared. The multi-breed G is set assuming known genotypes for one breed and missing genotypes for the remaining breeds. This setup may avoid spurious IBS relationships between breeds and considers breed-specific allele frequencies. This scenario was contrasted with multi-breed evaluations where all SNP are shared, i.e., the same SNP, and with single-breed evaluations. Different SNP densities, namely 9k and 45k, and different effective population sizes (Ne) were tested. Five breeds mimicking recent beef cattle populations that diverged from the same historical population were simulated using different selection criteria. It was assumed that QTL effects were the same over all breeds. For the recent population, generations 1 to 9 had approximately half of the animals genotyped, whereas all 1200 animals were genotyped in generation 10. Genotyped animals in generation 10 were set as validation; therefore, each breed had a validation set. Analyses were performed using single-step GBLUP (ssGBLUP). Prediction accuracy was calculated as the correlation between true (T) and genomic estimated (GE) BV. Accuracies of GEBV were lower for the larger Ne and low SNP density. All three scenarios using 45K resulted in similar accuracies, suggesting that the marker density is high enough to account for relationships and linkage disequilibrium with QTL. A shared multi-breed evaluation using 9K resulted in a decrease of accuracy of 0.08 for a smaller Ne and 0.11 for a larger Ne. This loss was mostly avoided when markers were treated as non-shared within the same genomic relationship matrix.
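One possible reading of the non-shared setup is sketched below. It assumes that each breed gets its own copy of the SNP columns, that genotypes are centered with breed-specific allele frequencies, and that the missing genotypes of other breeds contribute zero after centering. Under those assumptions G becomes block-diagonal, with a within-breed relationship matrix per breed and zero relationships between breeds. The per-breed scaling and the function name are hypothetical, and the abstract's single-step (ssGBLUP) machinery is not reproduced.

```python
import numpy as np

def nonshared_multibreed_grm(genotypes, breeds):
    """Block-diagonal multi-breed G with breed-specific allele frequencies.

    Assumption: other breeds' SNP copies are treated as missing and set to
    zero after centering, so between-breed genomic relationships are zero.
    """
    M = np.asarray(genotypes, dtype=float)    # animals x SNPs, coded 0/1/2
    breeds = np.asarray(breeds)
    n = M.shape[0]
    G = np.zeros((n, n))
    for breed in np.unique(breeds):
        idx = np.flatnonzero(breeds == breed)
        p = M[idx].mean(axis=0) / 2.0         # allele frequencies within this breed
        Z = M[idx] - 2.0 * p                  # centering within breed
        G[np.ix_(idx, idx)] = Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))
    return G
```

A shared-SNP evaluation would instead center all animals with one set of allele frequencies, which lets identity-by-state similarity between breeds enter G.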

A novel linkage-disequilibrium corrected genomic relationship matrix for SNP-heritability estimation and genomic prediction

Heredity ◽

10.1038/s41437-017-0023-4 ◽

2017 ◽

Vol 120 (4) ◽

pp. 356-368 ◽

Cited By ~ 12

Author(s):

Boby Mathew ◽

Jens Léon ◽

Mikko J. Sillanpää

Keyword(s):

Linkage Disequilibrium ◽

Genomic Prediction ◽

Genomic Relationship Matrix ◽

Relationship Matrix ◽

Genomic Relationship ◽

Heritability Estimation

Genomic Heritability: A Ragged Diagonal Between Bias and Variance

10.1101/2021.09.19.460999 ◽

2021 ◽

Author(s):

Mitchell J. Feldmann ◽

Hans-Peter Piepho ◽

Steven J. Knapp

Keyword(s):

Mixed Model ◽

DNA Polymorphisms ◽

Breeding Value ◽

Genomic Relationship Matrix ◽

Relationship Matrix ◽

Genomic Relationship ◽

Model Framework ◽

Kinship Matrix ◽

Genomic Heritability ◽

A Genome

Many important traits in plants, animals, and microbes are polygenic and are therefore difficult to improve through traditional marker-assisted selection. Genomic prediction addresses this by enabling the inclusion of all genetic data in a mixed model framework. The main method for predicting breeding values is genomic best linear unbiased prediction (GBLUP), which uses the realized genomic relationship or kinship matrix (K) to connect genotype to phenotype. The use of relationship matrices allows information to be shared for estimating the genetic values for observed entries and predicting genetic values for unobserved entries. One of the key parameters of such models is genomic heritability (h2g), or the variance of a trait associated with a genome-wide sample of DNA polymorphisms. Here we discuss the relationship between several common methods for calculating the genomic relationship matrix and propose a new matrix based on the average semivariance that yields accurate estimates of genomic variance in the observed population regardless of the focal population quality as well as accurate breeding value predictions in unobserved samples. Notably, our proposed method is highly similar to the approach presented by Legarra (2016) despite different mathematical derivations and statistical perspectives and only deviates from the classic approach presented in VanRaden (2008) by a scaling factor. With current approaches, we found that the genomic heritability tends to be either over- or underestimated depending on the scaling and centering applied to the marker matrix (Z), the value of the average diagonal element of K, and the assortment of alleles and heterozygosity (H) in the observed population. Unlike its predecessors, our newly proposed kinship matrix K_ASV yields accurate estimates of h2g in the observed population, generalizes to larger populations, and produces BLUPs equivalent to common methods in plants and animals.
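The statement that kinship matrices differing only by a scaling factor give equivalent BLUPs while changing the apparent heritability can be checked numerically. The sketch below does not construct the authors' K_ASV. It only shows that replacing K by c*K and the genetic variance by var_g/c leaves the BLUPs unchanged, while the naive ratio var_g / (var_g + var_e) changes. The function name, the use of the phenotypic mean as the only fixed effect, and the example values are assumptions.

```python
import numpy as np

def blup(K, y, var_g, var_e):
    """BLUP of genetic values for given variance components.

    Assumption: the phenotypic mean is the only fixed effect and the variance
    components are treated as known.
    """
    y = np.asarray(y, dtype=float)
    V = var_g * K + var_e * np.eye(len(y))    # phenotypic covariance matrix
    return var_g * K @ np.linalg.solve(V, y - y.mean())

rng = np.random.default_rng(3)
Z = rng.normal(size=(25, 100))                # stand-in for a centered marker matrix
K = Z @ Z.T / Z.shape[1]
y = rng.normal(size=25)

c = 3.0                                       # arbitrary rescaling of the kinship matrix
u_original = blup(K, y, var_g=2.0, var_e=1.0)
u_rescaled = blup(c * K, y, var_g=2.0 / c, var_e=1.0)

naive_h2_original = 2.0 / (2.0 + 1.0)
naive_h2_rescaled = (2.0 / c) / (2.0 / c + 1.0)
print(np.allclose(u_original, u_rescaled), naive_h2_original, naive_h2_rescaled)
```

Because var_g * K is the same product in both cases, the predictions coincide; only the interpretation of var_g, and therefore of the heritability ratio, depends on how K is scaled.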

Utility of Climatic Information via Combining Ability Models to Improve Genomic Prediction for Yield Within the Genomes to Fields Maize Project

Frontiers in Genetics ◽

10.3389/fgene.2020.592769 ◽

2021 ◽

Vol 11 ◽

Author(s):

Diego Jarquin ◽

Natalia de Leon ◽

Cinta Romay ◽

Martin Bohn ◽

Edward S. Buckler ◽

...

Keyword(s):

Genomic Prediction ◽

Combining Ability ◽

Prediction Models ◽

Predictive Ability ◽

Weather Data ◽

Genomic Relationship Matrix ◽

Relationship Matrix ◽

Environment Interaction ◽

Environmental Covariates ◽

Genotype By Environment

Genomic prediction provides an efficient alternative to conventional phenotypic selection for developing improved cultivars with desirable characteristics. New and improved methods for genomic prediction are continually being developed that attempt to deal with the integration of data types beyond genomic information. Modern automated weather systems offer the opportunity to capture continuous data on a range of environmental parameters at specific field locations. In principle, this information could characterize training and target environments and enhance predictive ability by incorporating weather characteristics as part of the genotype-by-environment (G×E) interaction component in prediction models. We assessed the usefulness of including weather data variables in genomic prediction models using a naïve environmental kinship model across 30 environments comprising the Genomes to Fields (G2F) initiative in 2014 and 2015. Specifically, four different prediction scenarios were evaluated: (i) tested genotypes in observed environments; (ii) untested genotypes in observed environments; (iii) tested genotypes in unobserved environments; and (iv) untested genotypes in unobserved environments. A set of 1,481 unique hybrids was evaluated for grain yield. Evaluations were conducted using five different models including main effect of environments; general combining ability (GCA) effects of the maternal and paternal parents modeled using the genomic relationship matrix; specific combining ability (SCA) effects between maternal and paternal parents; interactions between genetic (GCA and SCA) effects and environmental effects; and finally interactions between the genetic effects and environmental covariates. Incorporation of the genotype-by-environment interaction term improved predictive ability across all scenarios. However, predictive ability was not improved through inclusion of naïve environmental covariates in G×E models. More research should be conducted to link the observed weather conditions with important physiological aspects in plant development to improve predictive ability through the inclusion of weather data.

Simulation of pedigree vs. fully-informative marker based relationships matrices in a loblolly pine breeding population

10.1101/2021.11.30.468863 ◽

2021 ◽

Author(s):

Adam R Festa ◽

Ross Whetten

Keyword(s):

Inbreeding Depression ◽

Loblolly Pine ◽

Genetic Gain ◽

Field Experiments ◽

Breeding Value ◽

Genomic Relationship Matrix ◽

Relationship Matrix ◽

Genomic Relationship ◽

Deleterious Alleles

Computer simulations of breeding strategies are an essential resource for tree breeders because they allow exploratory analyses into potential long-term impacts on genetic gain and inbreeding consequences without bearing the cost, time, or resource requirements of field experiments. Previous work has modeled the potential long-term implications on inbreeding and genetic gain using random mating and phenotypic selection. Reduction in sequencing costs has enabled the use of DNA marker-based relationship matrices in addition to or in place of pedigree-based allele sharing estimates; this has been shown to provide a significant increase in the accuracy of progeny breeding value prediction. A potential pitfall of genomic selection using genetic relationship matrices is increased coancestry among selections, leading to the accumulation of deleterious alleles and inbreeding depression. We used simulation to compare the relative genetic gain and risk of inbreeding depression within a breeding program similar to loblolly pine, utilizing pedigree-based or marker-based relationships over ten generations. We saw a faster rate of purging deleterious alleles when using a genomic relationship matrix based on markers that track identity-by-descent of segments of the genome. Additionally, we observed an increase in the rate of genetic gain when using a genomic relationship matrix instead of a pedigree-based relationship matrix. While the genetic variance of populations decreased more rapidly when using genomic-based relationship matrices as opposed to pedigree-based, there appeared to be no long-term consequences on the accumulation of deleterious alleles within the simulated breeding strategy.

Comparison of Breeding Value by Establishment of Genomic Relationship Matrix in Pure Landrace Population

Journal of Animal Science and Technology ◽

10.5187/jast.2013.55.3.165 ◽

2013 ◽

Vol 55 (3) ◽

pp. 165-171

Author(s):

Joon-Ho Lee ◽

Kwang-Hyun Cho ◽

Chung-Il Cho ◽

Kyung-Do Park ◽

Deuk Hwan Lee

Keyword(s):

Breeding Value ◽

Genomic Relationship Matrix ◽

Relationship Matrix ◽

Genomic Relationship ◽

Landrace Population
