Application of single-step GBLUP in New Zealand Romney sheep

2020 ◽  
Vol 60 (9) ◽  
pp. 1136
Author(s):  
M. A. Nilforooshan

Context: In New Zealand, Romney is the predominant breed and is reared as a dual-purpose sheep. The number of genotypes is rapidly increasing in the sheep population, and making use of both genotype and pedigree information is important for genetic evaluations. Single-step genomic best linear unbiased prediction (ssGBLUP) is a method for simultaneous prediction of genetic merit for genotyped and non-genotyped animals. The combination and compatibility of the genomic relationship matrix (G) and the pedigree relationship matrix for genotyped animals (A22) are important for unbiased ssGBLUP. Aims: The aim of the present study was to find an optimum genetic relationship matrix for ssGBLUP weaning-weight evaluation of Romney sheep in New Zealand. Methods: Data consisted of adjusted weaning weights for 2422011 sheep, 50K single-nucleotide polymorphism genotypes for 13304 animals and 3028688 animals in the pedigree. Blending of G and A22 was tested with weights (k) ranging from 0.2 to 0.99 (kG + (1 – k)A22), followed by none or one of three methods of tuning G to A22. Key results: The averages of G and A22 were close to each other for overall, diagonal and off-diagonal elements. Therefore, differently tuned G performed similarly. However, elements of G showed larger variation than did the elements of A22 and, on average, genotyped animals were less related in G than in A22. Correlations between genomic estimated breeding values (GEBV) for the top 500 genotyped animals, as well as the rank correlations, were almost 1 among ssGBLUP evaluations using tuned G. The corresponding correlations with BLUP evaluations were increased by blending G with a larger proportion of A22, and were further increased by tuning G, indicating improved compatibility between G and A22. Blending and tuning G suppressed the inflation of GEBV and the bias, and moved the genetic trend closer to the trend obtained from BLUP. Conclusions: A combination of blending and tuning G to A22, with a blending rate of 0.5 at most, is recommended for weaning weight of Romney sheep in New Zealand. Failure to do so resulted in inflated GEBV, which can reduce the accuracy of selection, especially for genotyped animals. Implications: There is growing interest in the single-step GBLUP method for simultaneous genetic evaluation of genotyped and non-genotyped animals, in which genomic and pedigree relationship matrices are combined. Using data from New Zealand Romney sheep, we have shown that adjustment of the genomic relationship matrix on the basis of the pedigree relationship matrix is necessary to avoid inflated evaluations. Improving the compatibility between genomic and pedigree relationship matrices is important for obtaining accurate and unbiased single-step GBLUP evaluations.
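
The blending and tuning steps described in this abstract can be illustrated with a small NumPy sketch. The abstract does not say which three tuning methods were tested, so the tuning rule below (a linear rescaling a + b*G that matches the mean diagonal and the overall mean of G to those of A22) is only an assumed, commonly used choice, and the function names and toy matrices are hypothetical.

```python
import numpy as np

def blend(G, A22, k):
    """Blend G with A22 using weight k on G: k*G + (1 - k)*A22."""
    return k * G + (1.0 - k) * A22

def tune_to_a22(G, A22):
    """Assumed tuning rule: return a + b*G so that the mean diagonal
    and the overall mean of the result equal those of A22."""
    mean_diag_g, mean_all_g = np.mean(np.diag(G)), np.mean(G)
    mean_diag_a, mean_all_a = np.mean(np.diag(A22)), np.mean(A22)
    b = (mean_diag_a - mean_all_a) / (mean_diag_g - mean_all_g)
    a = mean_all_a - b * mean_all_g
    return a + b * G

# Toy 3-animal matrices (made-up values, for illustration only)
A22 = np.array([[1.00, 0.50, 0.25],
                [0.50, 1.00, 0.25],
                [0.25, 0.25, 1.00]])
G = np.array([[1.10, 0.40, 0.15],
              [0.40, 0.95, 0.30],
              [0.15, 0.30, 1.05]])

G_blended = blend(G, A22, k=0.5)        # blending first, as in the abstract
G_final = tune_to_a22(G_blended, A22)   # then tuning to A22
```

After tuning, the average diagonal and the average of all elements of `G_final` agree with those of `A22`, which is the kind of compatibility the abstract refers to.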

2019 ◽  
Vol 97 (Supplement_3) ◽  
pp. 49-49
Author(s):  
Andre Garcia ◽  
Yutaka Masuda ◽  
Stephen P Miller ◽  
Ignacy Misztal ◽  
Daniela Lourenco

Abstract With the increasing number of genotyped animals, the algorithm for proven and young (APY) can be used to compute the inverse of the genomic relationship matrix (G⁻¹_APY) in genomic BLUP (GBLUP) and single-step GBLUP (ssGBLUP). This algorithm also allows the use of all genotyped animals to calculate SNP effects from genomic EBV (GEBV), which can then be used to obtain indirect predictions (IP) for interim evaluations, or as genomic predictions for animals not included in official evaluations. The objective of the study was to evaluate the quality of IP from GBLUP with an increasing number of genotyped animals. Birth weight, weaning weight and post-weaning gain phenotypes and genotypes were provided by the American Angus Association. Phenotypes and genotypes were divided into 3 scenarios based on birth year: genotyped animals born up to 2013 (114,937), 2014 (183,847) and 2015 (280,506). A 3-trait model was fit, and GBLUP with APY was used to calculate GEBV and SNP effects. To calculate G⁻¹_APY, 19,021 core animals were randomly sampled from animals born up to 2013. Core animals remained the same, whereas the number of non-core animals increased as more genotyped animals were added. Additional analyses had updated core animals for each scenario. SNP effects were also calculated based on G⁻¹_APY and on G⁻¹ for core animals only (G⁻¹_core). IP were computed for all animals in each scenario by multiplying the centered genotypes by the SNP effects. To assess the quality of IP, the correlation between IP and GEBV was calculated. Correlations were greater than 0.99 for all traits in all scenarios. Despite the increase of non-core animals in APY, GEBV were successfully retrieved from SNP effects using IP. When SNP effects were calculated based on G⁻¹_core, updating the core animals as the number of genotyped animals increases seems to be the best choice.
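
The backsolving step can be sketched in NumPy under simplifying assumptions that are not taken from the abstract: a full-rank G built as ZZ'/(2Σp(1-p)) with assumed base allele frequencies of 0.5, no blending, direct inversion instead of APY, and random numbers standing in for GEBV. In that idealized setting the indirect predictions reproduce the GEBV exactly; with APY or a blended G the agreement would only be approximate.

```python
import numpy as np

rng = np.random.default_rng(1)
n_animals, n_snp = 20, 200

# Toy genotypes coded 0/1/2 (simulated, not real data)
M = rng.integers(0, 3, size=(n_animals, n_snp)).astype(float)
p = np.full(n_snp, 0.5)                 # assumed base allele frequencies
Z = M - 2.0 * p                         # centered genotypes
scale = 2.0 * np.sum(p * (1.0 - p))
G = Z @ Z.T / scale                     # genomic relationship matrix

gebv = rng.normal(size=n_animals)       # stand-in for GEBV from an evaluation

# Backsolve SNP effects from GEBV: a_hat = Z' G^{-1} u_hat / scale
snp_effects = Z.T @ np.linalg.solve(G, gebv) / scale

# Indirect prediction: centered genotypes times SNP effects
ip = Z @ snp_effects
corr = np.corrcoef(ip, gebv)[0, 1]
```

For animals outside the evaluation, the same `snp_effects` vector would be multiplied by their own centered genotypes, using the same allele frequencies for centering.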


2019 ◽  
Vol 97 (Supplement_3) ◽  
pp. 49-50
Author(s):  
Yvette Steyn ◽  
Daniela Lourenco ◽  
Ignacy Misztal

Abstract Multi-breed evaluations have the advantage of increasing the size of the reference population for genomic evaluations and are quite simple; however, combining breeds usually has a negative impact on prediction accuracy. The aim of this study was to evaluate the use of a multi-breed genomic relationship matrix (G), where SNP for each breed are non-shared. The multi-breed G is set up assuming known genotypes for one breed and missing genotypes for the remaining breeds. This setup may avoid spurious identity-by-state (IBS) relationships between breeds and considers breed-specific allele frequencies. This scenario was contrasted with multi-breed evaluations where all SNP are shared, i.e., the same SNP, and with single-breed evaluations. Different SNP densities, namely 9k and 45k, and different effective population sizes (Ne) were tested. Five breeds mimicking recent beef cattle populations that diverged from the same historical population were simulated using different selection criteria. It was assumed that QTL effects were the same over all breeds. For the recent population, generations 1 to 9 had approximately half of the animals genotyped, whereas all 1200 animals were genotyped in generation 10. Genotyped animals in generation 10 were set as validation; therefore, each breed had a validation set. Analyses were performed using single-step GBLUP (ssGBLUP). Prediction accuracy was calculated as the correlation between true (T) and genomic estimated (GE) BV. Accuracies of GEBV were lower for the larger Ne and low SNP density. All three scenarios using 45k resulted in similar accuracies, suggesting that the marker density is high enough to account for relationships and linkage disequilibrium with QTL. A shared multi-breed evaluation using 9k resulted in a decrease of accuracy of 0.08 for a smaller Ne and 0.11 for a larger Ne. This loss was mostly avoided when markers were treated as non-shared within the same genomic relationship matrix.


Author(s):  
Alban Bouquet ◽  
Mikko Sillanpää ◽  
Jarmo Juga

The aim of this simulation study was to compare the accuracy and bias of different inbreeding (F) estimators exploiting dense panels of diallelic markers and pedigree information. All genotype simulations were started by generating an ancestral population at mutation-drift equilibrium considering an effective size of 1000 and a mutation rate (µ) of 5 × 10⁻⁴. Two types of subpopulation were derived from the ancestral population for 10 discrete generations. They differed in the level of selection applied to both males and females: no selection, or a structure close to a breeding program with selection of the best 40 males and 500 females on EBV with accuracy of 0.85 and 0.71, respectively, on a trait with heritability of 0.3. Marker panels were made up of 36 000 biallelic markers (18 per cM) and were available for animals in the last 4 generations. Pedigrees were recorded on the last 8 generations. For each scenario, 30 replicates were carried out. Analysed estimators were the correlation (VR1) and regression (VR3) estimators described by VanRaden in 2008 to build the genomic relationship matrix. Other estimators included the weighted corrected similarity (WCS) estimator published by Ritland in 1996 and a modified WCS estimator accounting for pedigree information (WPCS). Pedigree-based inbreeding (PED) was also estimated using exhaustive pedigree information. Inbreeding estimates were correlated with and regressed on the true simulated genomic F values to assess the precision and bias of estimators, respectively. Main results show that use of dense marker information improves the estimation of F, whatever the scenario. The accuracy of F estimates and the bias were increased in the presence of selection, except for PED. Across scenarios, VR3, WCS and WPCS were the most correlated with true F values. In the situation where pedigree was exhaustive, VR3 performed as well as WCS and WPCS but had a larger variability over replicates. Although less biased on average, VR1 was less accurate than other estimators, especially when allele frequencies were not properly defined. Accounting for pedigree information in WCS did not increase its estimation accuracy and did not reduce bias in the tested scenarios. Finally, error in estimating inbreeding trends over time in selected populations was greater for some marker-based estimators (VR3, VR1) than for the PED estimator. WCS and WPCS rendered the most accurate estimations of inbreeding trends. Thus, results indicate that WCS, which can also be used with multiallelic markers, is a promising estimator both to build the genomic relationship matrix for genomic evaluations and to better assess genetic diversity in selected populations.


2016 ◽  
Author(s):  
Carolina Andrea Garcia-Baccino ◽  
Andres Legarra ◽  
Ole F Christensen ◽  
Ignacy Misztal ◽  
Ivan Pocrnic ◽  
...  

Abstract. Background: Metafounders are pseudo-individuals that condense the genetic heterozygosity and relationships within and across base pedigree populations, i.e. ancestral populations. This work addresses estimation and usefulness of metafounder relationships in Single Step GBLUP. Results: We show that the ancestral relationship parameters are proportional to standardized covariances of base allelic frequencies across populations, like Fst fixation indexes. These covariances of base allelic frequencies can be estimated from marker genotypes of related recent individuals, and pedigree. Simple methods for estimation include naïve computation of allele frequencies from marker genotypes or a method of moments equating average pedigree-based and marker-based relationships. Complex methods include generalized least squares or maximum likelihood based on pedigree relationships. To our knowledge, methods to infer Fst coefficients and Fst differentiation have not been developed for related populations. A compatible genomic relationship matrix constructed as a crossproduct of {−1,0,1} codes, and equivalent (up to scale factors) to an identity by state relationship matrix at the markers, is derived. Using a simulation with a single population under selection, in which only males and youngest animals were genotyped, we observed that generalized least squares or maximum likelihood gave accurate and unbiased estimates of the ancestral relationship parameter (true value: 0.40) whereas the other two (naïve and method of moments) were biased (estimates of 0.43 and 0.35). We also observed that genomic evaluation by Single Step GBLUP using metafounders was less biased in terms of genetic trend (bias of 0.01 instead of 0.12), slightly overdispersed (0.94 instead of 0.99) and as accurate (0.74) as the regular Single Step GBLUP. Single Step GBLUP using metafounders also provided consistent estimates of heritability. Conclusions: Estimation of metafounder relationships can be achieved using BLUP-like methods with pedigree and markers. Inclusion of metafounder relationships reduces the bias of genomic predictions with no loss in accuracy.


2020 ◽  
Vol 98 (Supplement_4) ◽  
pp. 6-7
Author(s):  
Andre Garcia ◽  
Ignacio Aguilar ◽  
Andres Legarra ◽  
Stephen P Miller ◽  
Shogo Tsuruta ◽  
...  

Abstract With an ever-increasing number of genotyped animals, there is a question of whether to include all genotypes in single-step GBLUP (ssGBLUP) evaluations or to include only genotyped animals with phenotypes and use indirect predictions (IP) for the remaining young genotyped animals. Under ssGBLUP, SNP effects can be backsolved from GEBV, and IP can be calculated as the sum of SNP effects weighted by the gene content. To publish IP, a measure of accuracy that reflects the standard error of prediction, and that is comparable to GEBV accuracy, is needed. Our first objective was to test formulas to compute accuracy of IP by backsolving prediction error covariance (PEC) of GEBV into PEC of SNP effects. The second objective was to investigate the number of genotyped animals needed to obtain robust IP accuracy. Data were provided by the American Angus Association, with 38,000 post-weaning gain phenotypes and 60,000 genotyped animals. Correlations between GEBV and IP were ≥0.99. When all genotyped animals were used for PEC computations, accuracy correlations were also ≥0.99. Additionally, GEBV and IP accuracies were compatible, whether the inverse of the genomic relationship matrix (G) was obtained by direct inversion or by the algorithm for proven and young (APY). As the number of genotyped animals in PEC computations decreased to 15,000, accuracy correlations were still high (≥0.96), but IP accuracies were biased downwards. Indirect prediction accuracy can be successfully obtained from ssGBLUP without running an extra SNP-BLUP evaluation to compute SNP PEC. It is possible to reduce the number of genotyped animals in PEC computations, but accuracies may be slightly underestimated. When the amount of genomic and phenotypic data is large, the polygenic part of GEBV becomes small and IP can be very accurate. Further research is needed to approximate SNP PEC with a large number of genotyped animals.


2019 ◽  
Vol 97 (11) ◽  
pp. 4418-4427 ◽  
Author(s):  
Yvette Steyn ◽  
Daniela A L Lourenco ◽  
Ignacy Misztal

Abstract Combining breeds in a multibreed evaluation can have a negative impact on prediction accuracy, especially if single nucleotide polymorphism (SNP) effects differ among breeds. The aim of this study was to evaluate the use of a multibreed genomic relationship matrix (G), where SNP effects are considered to be unique to each breed, that is, nonshared. This multibreed G was created by treating SNP of different breeds as if they were on nonoverlapping positions on the chromosome, although, in reality, they were not. This simple setup may avoid spurious Identity by state (IBS) relationships between breeds and automatically considers breed-specific allele frequencies. This scenario was contrasted to a regular multibreed evaluation where all SNPs were shared, that is, the same position, and to single-breed evaluations. Different SNP densities (9k and 45k) and different effective population sizes (Ne) were tested. Five breeds mimicking recent beef cattle populations that diverged from the same historical population were simulated using different selection criteria. It was assumed that quantitative trait locus (QTL) effects were the same over all breeds. For the recent population, generations 1–9 had approximately half of the animals genotyped, whereas all animals in generation 10 were genotyped. Generation 10 animals were set for validation; therefore, each breed had a validation group. Analyses were performed using single-step genomic best linear unbiased prediction. Prediction accuracy was calculated as the correlation between true (T) and genomic estimated breeding values (GEBV). Accuracies of GEBV were lower for the larger Ne and low SNP density. All three evaluation scenarios using 45k resulted in similar accuracies, suggesting that the marker density is high enough to account for relationships and linkage disequilibrium with QTL. A shared multibreed evaluation using 9k resulted in a decrease of accuracy of 0.08 for a smaller Ne and 0.12 for a larger Ne. 
This loss was mostly avoided when markers were treated as nonshared within the same G matrix. A G matrix with nonshared SNP enables multibreed evaluations without considerably changing accuracy, especially with limited information per breed.
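
A minimal NumPy sketch of the nonshared idea follows: each breed gets its own block of SNP columns, animals of other breeds have zeros in that block (centered missing genotypes), and centering uses breed-specific allele frequencies. The function name, the toy data, and the per-breed scaling by 2Σp(1-p) are assumptions made for illustration; the abstract does not state how the matrix was scaled.

```python
import numpy as np

def nonshared_g(M, breed):
    """Build a multibreed G in which SNP of different breeds occupy
    nonoverlapping column blocks (hypothetical helper, assumed scaling)."""
    breeds = np.unique(breed)
    n, m = M.shape
    Z = np.zeros((n, m * len(breeds)))
    for b_idx, b in enumerate(breeds):
        rows = np.where(breed == b)[0]
        p = M[rows].mean(axis=0) / 2.0            # breed-specific allele frequencies
        scale = 2.0 * np.sum(p * (1.0 - p))       # assumed per-breed scaling
        block = (M[rows] - 2.0 * p) / np.sqrt(scale)
        Z[rows, b_idx * m:(b_idx + 1) * m] = block
    return Z @ Z.T

rng = np.random.default_rng(7)
M = rng.integers(0, 3, size=(6, 50)).astype(float)   # toy 0/1/2 genotypes
breed = np.array([0, 0, 0, 1, 1, 1])

G_ns = nonshared_g(M, breed)
between = G_ns[:3, 3:]     # relationships between the two breeds
```

Because the two breeds never share a nonzero column, the between-breed block of `G_ns` is zero, while each within-breed block equals the single-breed G computed with that breed's own allele frequencies.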


Animals ◽  
2021 ◽  
Vol 11 (11) ◽  
pp. 3234
Author(s):  
José Cortes-Hernández ◽  
Adriana García-Ruiz ◽  
Carlos Gustavo Vásquez-Peláez ◽  
Felipe de Jesus Ruiz-Lopez

This study aimed to identify inbreeding coefficient (F) estimators useful for improvement programs in a small Holstein population through the evaluation of different methodologies in the Mexican Holstein population. F was estimated as follows: (a) from pedigree information (Fped); (b) through runs of homozygosity (Froh); (c) from the number of observed and expected homozygotic SNP in the individuals (Fgeno); (d) through the genomic relationship matrix (Fmg). The study included information from 4277 animals with pedigree records and 100,806 SNP. The average and standard deviation values of F were 3.11 ± 2.30 for Fped, −0.02 ± 3.55 for Fgeno, 2.77 ± 0.71 for Froh and 3.03 ± 3.05 for Fmg. The correlations between coefficients varied from 0.30 between Fped and Froh, to 0.96 between Fgeno and Fmg. Differences in the level of inbreeding among the parent’s country of origin were found regardless of the method used. The correlations among genomic inbreeding coefficients were high; however, they were low with Fped, so further research on this topic is required.
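
Two of the marker-based coefficients can be sketched with NumPy. The formulas below are common textbook forms and are assumed here, since the abstract does not give the exact expressions: Fgeno as (observed - expected homozygous SNP) / (total SNP - expected), and Fmg as the diagonal of a VanRaden-type genomic relationship matrix minus 1. Function names, genotypes, and allele frequencies are hypothetical.

```python
import numpy as np

def f_homozygosity(M, p):
    """Assumed Fgeno form: excess of observed over expected homozygous SNP."""
    m = M.shape[1]
    observed = np.sum(M != 1, axis=1)               # homozygous calls (0 or 2)
    expected = np.sum(1.0 - 2.0 * p * (1.0 - p))    # under Hardy-Weinberg
    return (observed - expected) / (m - expected)

def f_grm_diagonal(M, p):
    """Assumed Fmg form: diagonal of ZZ'/(2*sum(p*(1-p))) minus 1."""
    Z = M - 2.0 * p
    return np.sum(Z ** 2, axis=1) / (2.0 * np.sum(p * (1.0 - p))) - 1.0

# Toy genotypes coded 0/1/2 for 3 animals and 4 SNP, allele frequency 0.5
M = np.array([[0, 2, 2, 0],     # fully homozygous
              [1, 1, 1, 1],     # fully heterozygous
              [0, 1, 2, 1]],    # half homozygous
             dtype=float)
p = np.full(4, 0.5)

f_geno = f_homozygosity(M, p)   # [ 1., -1.,  0.]
f_mg = f_grm_diagonal(M, p)     # [ 1., -1.,  0.]
```

With all allele frequencies at 0.5 the two coefficients coincide in this toy case, which is consistent in spirit with the high correlation between Fgeno and Fmg reported above; with real, unequal frequencies they generally differ.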


2021 ◽  
Vol 99 (2) ◽  
Author(s):  
Yutaka Masuda ◽  
Shogo Tsuruta ◽  
Matias Bermann ◽  
Heather L Bradford ◽  
Ignacy Misztal

Abstract Pedigree information is often missing for some animals in a breeding program. Unknown-parent groups (UPGs) are assigned to the missing parents to avoid biased genetic evaluations. Although the use of UPGs is well established for the pedigree model, it is unclear how UPGs are integrated into the inverse of the unified relationship matrix (H-inverse) required for single-step genomic best linear unbiased prediction. A generalization of the UPG model is the metafounder (MF) model. The objectives of this study were to derive 3 H-inverses and to compare genetic trends among models with UPG and MF H-inverses using a simulated purebred population. All inverses were derived using the joint density function of the random breeding values and genetic groups. The breeding values of genotyped animals (u2) were assumed to be adjusted for UPG effects (g) using matrix Q2 as u2∗=u2+Q2g before incorporating genomic information. The Quaas–Pollak-transformed (QP) H-inverse was derived using a joint density function of u2∗ and g updated with genomic information and assuming nonzero cov(u2∗,g′). The modified QP (altered) H-inverse also assumes that the genomic information updates u2∗ and g, but cov(u2∗,g′)=0. The UPG-encapsulated (EUPG) H-inverse assumed genomic information updates the distribution of u2∗. The EUPG H-inverse had the same structure as the MF H-inverse. Fifty percent of the genotyped females in the simulation had a missing dam, and missing parents were replaced with UPGs by generation. The simulation study indicated that u2∗ and g in models using the QP and altered H-inverses may be inseparable leading to potential biases in genetic trends. Models using the EUPG and MF H-inverses showed no genetic trend biases. These 2 H-inverses yielded the same genomic EBV (GEBV). The predictive ability and inflation of GEBVs from young genotyped animals were nearly identical among models using the QP, altered, EUPG, and MF H-inverses. 
Although the choice of H-inverse in real applications with enough data may not result in biased genetic trends, the EUPG and MF H-inverses are to be preferred because of theoretical justification and possibility to reduce biases.

