Metafounders are Fst fixation indices and reduce bias in single step genomic evaluations

2016 ◽  
Author(s):  
Carolina Andrea Garcia-Baccino ◽  
Andres Legarra ◽  
Ole F Christensen ◽  
Ignacy Misztal ◽  
Ivan Pocrnic ◽  
...  

ABSTRACT BACKGROUND Metafounders are pseudo-individuals that condense the genetic heterozygosity and relationships within and across base pedigree populations, i.e. ancestral populations. This work addresses the estimation and usefulness of metafounder relationships in Single Step GBLUP. RESULTS We show that the ancestral relationship parameters are proportional to standardized covariances of base allelic frequencies across populations, like Fst fixation indices. These covariances of base allelic frequencies can be estimated from marker genotypes of related recent individuals, and pedigree. Simple methods for estimation include naïve computation of allele frequencies from marker genotypes or a method of moments equating average pedigree-based and marker-based relationships. Complex methods include generalized least squares or maximum likelihood based on pedigree relationships. To our knowledge, methods to infer Fst coefficients and Fst differentiation have not been developed for related populations. A compatible genomic relationship matrix constructed as a crossproduct of {−1,0,1} codes, and equivalent (up to scale factors) to an identity by state relationship matrix at the markers, is derived. Using a simulation with a single population under selection, in which only males and youngest animals were genotyped, we observed that generalized least squares or maximum likelihood gave accurate and unbiased estimates of the ancestral relationship parameter (true value: 0.40) whereas the other two (naïve and method of moments) were biased (estimates of 0.43 and 0.35). We also observed that genomic evaluation by Single Step GBLUP using metafounders was less biased in terms of genetic trend (0.01 instead of 0.12 bias), slightly overdispersed (0.94 instead of 0.99) and as accurate (0.74) as the regular Single Step GBLUP. Single Step GBLUP using metafounders also provided consistent estimates of heritability. CONCLUSIONS Estimation of metafounder relationship can be achieved using BLUP-like methods with pedigree and markers. Inclusion of metafounder relationships improves bias of genomic predictions with no loss in accuracy.
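
The naïve estimator and the {−1,0,1} crossproduct matrix described in this abstract can be sketched in plain Python. This is only an illustration under stated assumptions: the abstract says the ancestral relationship is "proportional" to the variance of base allele frequencies and that the matrix is defined "up to scale factors", so the factor 8 and the divisor m/2 used here are assumptions, and the function names and toy genotypes are hypothetical.

```python
from statistics import pvariance


def naive_gamma(genotypes):
    """Naive estimate of the ancestral relationship for one population.

    genotypes: list of individuals, each a list of allele counts (0, 1 or 2).
    Allele frequencies are computed directly from the observed genotypes.
    ASSUMPTION: proportionality constant of 8 (not stated in the abstract).
    """
    n = len(genotypes)
    n_markers = len(genotypes[0])
    freqs = [sum(ind[j] for ind in genotypes) / (2 * n) for j in range(n_markers)]
    return 8 * pvariance(freqs)


def crossproduct_g(genotypes):
    """Genomic relationship matrix as a crossproduct of {-1, 0, 1} codes.

    Z is the allele count minus 1; G = Z Z' / (m / 2).
    ASSUMPTION: the scale factor m / 2, where m is the number of markers.
    """
    m = len(genotypes[0])
    z = [[g - 1 for g in ind] for ind in genotypes]
    scale = m / 2
    return [[sum(a * b for a, b in zip(zi, zj)) / scale for zj in z] for zi in z]


# Hypothetical toy data: 3 individuals, 4 markers.
toy = [
    [0, 1, 2, 2],
    [1, 1, 2, 0],
    [2, 0, 1, 2],
]
gamma_hat = naive_gamma(toy)
G = crossproduct_g(toy)
```

Because the codes are centred at 1 rather than at observed allele frequencies, this G does not depend on any frequency estimate, which is what makes it usable alongside a separately estimated ancestral relationship. The abstract reports that the naïve estimate was biased in the simulation, so a sketch like this is a starting point, not a recommended estimator.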

2020 ◽  
Vol 60 (9) ◽  
pp. 1136
Author(s):  
M. A. Nilforooshan

Context In New Zealand, Romney is the predominant breed and is reared as a dual-purpose sheep. The number of genotypes is rapidly increasing in the sheep population, and making use of both genotypes and pedigree information is of importance for genetic evaluations. Single-step genomic best linear unbiased prediction (ssGBLUP) is a method for simultaneous prediction of genetic merits for genotyped and non-genotyped animals. The combination and the compatibility of the genomic relationship matrix (G) and the pedigree relationship matrix for genotyped animals (A22) is important for unbiased ssGBLUP. Aims The aim of the present study was to find an optimum genetic relationship matrix for ssGBLUP weaning-weight evaluation of Romney sheep in New Zealand. Methods Data consisted of adjusted weaning weights for 2,422,011 sheep, 50K single-nucleotide polymorphism genotypes for 13,304 animals and 3,028,688 animals in the pedigree. Blending of G and A22 was tested with weights (k) ranging from 0.2 to 0.99 (kG + (1 – k)A22), followed by none or one of the three methods of tuning G to A22. Key results The averages of G and A22 were close to each other for overall, diagonal and off-diagonal elements. Therefore, differently tuned G performed similarly. However, elements of G showed larger variation than did the elements of A22 and, on average, genotyped animals were less related in G than in A22. Correlations between genomic estimated breeding values (GEBV) for the top 500 genotyped animals, as well as the rank correlations, were almost 1 among ssGBLUP evaluations using tuned G. The corresponding correlations with BLUP evaluations were increased by blending G with a larger proportion of A22, and were further increased by tuning G, indicating improved compatibility between G and A22. Blending and tuning G suppressed the inflation of GEBV and bias, and moved the genetic trend closer to that obtained from BLUP. 
Conclusions A combination of blending and tuning G to A22, with a blending rate of 0.5 at most, is recommended for weaning weight of Romney sheep in New Zealand. Failure to do that resulted in inflated GEBV that can reduce the accuracy of selection, especially for genotyped animals. Implications There is a growing interest in the single-step GBLUP method for simultaneous genetic evaluation of genotyped and non-genotyped animals, in which genomic and pedigree relationship matrices are admixed. Using data from New Zealand Romney sheep, we have shown that adjustment of the genomic relationship matrix on the basis of the pedigree relationship matrix is necessary to avoid inflated evaluations. Improving the compatibility between genomic and pedigree relationship matrices is important for obtaining accurate and unbiased single-step GBLUP evaluations.
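
The blending step kG + (1 – k)A22 from this abstract can be sketched in a few lines of plain Python. This is a minimal illustration, not the study's implementation: the function names and the 2×2 matrices are hypothetical, and the three tuning methods are not reproduced because the abstract does not specify them. The helper that reports mean diagonal and off-diagonal elements mirrors the comparison of averages mentioned under "Key results".

```python
def blend(G, A22, k):
    """Return k*G + (1 - k)*A22 for square matrices given as nested lists."""
    if not 0.0 <= k <= 1.0:
        raise ValueError("k must lie in [0, 1]")
    return [
        [k * g + (1 - k) * a for g, a in zip(g_row, a_row)]
        for g_row, a_row in zip(G, A22)
    ]


def mean_diag_offdiag(M):
    """Mean of the diagonal and mean of the off-diagonal elements of M."""
    n = len(M)
    diag = sum(M[i][i] for i in range(n)) / n
    off = sum(M[i][j] for i in range(n) for j in range(n) if i != j) / (n * (n - 1))
    return diag, off


# Hypothetical relationship matrices for two genotyped animals.
G = [[1.10, 0.20], [0.20, 0.90]]
A22 = [[1.00, 0.50], [0.50, 1.00]]

blended = blend(G, A22, k=0.5)   # the upper blending rate recommended in the abstract
diag_mean, offdiag_mean = mean_diag_offdiag(blended)
```

A smaller k pulls the blended matrix towards the pedigree relationships, which is consistent with the reported increase in correlation with BLUP evaluations as the proportion of A22 grows.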


1988 ◽  
Vol 25 (3) ◽  
pp. 301-307
Author(s):  
Wilfried R. Vanhonacker

Estimating autoregressive current effects models is not straightforward when observations are aggregated over time. The author evaluates a familiar iterative generalized least squares (IGLS) approach and contrasts it to a maximum likelihood (ML) approach. Analytic and numerical results suggest that (1) IGLS and ML provide good estimates for the response parameters in instances of positive serial correlation, (2) ML provides superior (in mean squared error) estimates for the serial correlation coefficient, and (3) IGLS might have difficulty in deriving parameter estimates in instances of negative serial correlation.


2020 ◽  
Vol 2020 ◽  
pp. 1-15
Author(s):  
David Adedia ◽  
Atinuke O. Adebanji ◽  
Simon Kojo Appiah

This study compared a ridge maximum likelihood estimator to the Yuan and Chan (2008) ridge maximum likelihood, maximum likelihood, unweighted least squares, generalized least squares, and asymptotic distribution-free estimators in fitting six models that show relationships in some noncommunicable diseases. Uncontrolled hypertension has been shown to be a leading cause of coronary heart disease, kidney dysfunction, and other negative health outcomes. It poses equal danger when asymptomatic and undetected. Research has also shown that it tends to coexist with diabetes mellitus (DM), with the presence of DM doubling the risk of hypertension. The study assessed the effect of obesity, type II diabetes, and hypertension on coronary risk, and also the existence of a converse relationship, with structural equation modelling (SEM). The results showed that the two ridge estimators did better than the other estimators. Nonconvergence occurred for most of the models with the asymptotic distribution-free and unweighted least squares estimators, whilst the generalized least squares estimator had one case of nonconvergence. Other estimators provided competing outputs, but the unweighted least squares estimator reported unreliable parameter estimates, such as a large chi-square test statistic and root mean square error of approximation for Model 3. The maximum likelihood family of estimators did better than others, such as the asymptotic distribution-free estimator, in terms of overall model fit and parameter estimation. Also, the study found that an increase in obesity could result in a significant increase in both hypertension and coronary risk. Diastolic blood pressure and diabetes have significant converse effects on each other. This implies that those who are hypertensive can develop diabetes and vice versa.


1983 ◽  
Vol 13 (4) ◽  
pp. 387-404 ◽  
Author(s):  
G. J. Huba ◽  
L. L. Harlow

Latent variable causal modeling techniques are sometimes criticized when applied to drug abuse data because the commonly-employed maximum likelihood parameter estimation method requires that the data be normally distributed for the statistical tests to be accurate. In this article, four estimators for the parameters in two large latent variable causal models are compared in real drug abuse datasets. One estimator does not require that the data be multivariate normal and does, in fact, correct for data non-normality. Specifically, maximum likelihood and generalized least squares estimators for normally-distributed variables are compared with Browne's asymptotically distribution free techniques for continuous non-normally distributed data. Additionally, ordinary (unweighted) least squares estimates are used. Descriptions of the techniques are given and actual results in two “real” datasets are provided. It is concluded that the distribution free technique provides results which are generally comparable to those obtained with maximum likelihood estimation for datasets which depart in typical ways from the ideal of the multivariate normal distribution.


2019 ◽  
Vol 51 (1) ◽  
Author(s):  
Øyvind Nordbø ◽  
Arne B. Gjuvsland ◽  
Leiv Sigbjørn Eikje ◽  
Theo Meuwissen

Abstract Background The main aim of single-step genomic predictions was to facilitate optimal selection in populations consisting of both genotyped and non-genotyped individuals. However, in spite of intensive research, biases still occur, which make it difficult to perform optimal selection across groups of animals. The objective of this study was to investigate whether incomplete genotype datasets with errors could be a potential source of level-bias between genotyped and non-genotyped animals and between animals genotyped on different single nucleotide polymorphism (SNP) panels in single-step genomic predictions. Results Incomplete and erroneous genotypes of young animals caused biases in breeding values between groups of animals. Systematic noise or missing data for less than 1% of the SNPs in the genotype data had substantial effects on the differences in breeding values between genotyped and non-genotyped animals, and between animals genotyped on different chips. The breeding values of young genotyped individuals were biased upward, and the magnitude was up to 0.8 genetic standard deviations, compared with breeding values of non-genotyped individuals. Similarly, the magnitude of a small value added to the diagonal of the genomic relationship matrix affected the level of average breeding values between groups of genotyped and non-genotyped animals. Cross-validation accuracies and regression coefficients were not sensitive to these factors. Conclusions Because, historically, different SNP chips have been used for genotyping different parts of a population, fine-tuning of imputation within and across SNP chips and handling of missing genotypes are crucial for reducing bias. Although all the SNPs used for estimating breeding values are present on the chip used for genotyping young animals, incompleteness and some genotype errors might lead to level-biases in breeding values.


2019 ◽  
Vol 97 (Supplement_3) ◽  
pp. 49-50
Author(s):  
Yvette Steyn ◽  
Daniela Lourenco ◽  
Ignacy Misztal

Abstract Multi-breed evaluations have the advantage of increasing the size of the reference population for genomic evaluations and are quite simple; however, combining breeds usually has a negative impact on prediction accuracy. The aim of this study was to evaluate the use of a multi-breed genomic relationship matrix (G), where SNP for each breed are non-shared. The multi-breed G is set assuming known genotypes for one breed and missing genotypes for the remaining breeds. This setup may avoid spurious IBS relationships between breeds and considers breed-specific allele frequencies. This scenario was contrasted to multi-breed evaluations where all SNP are shared, i.e., the same SNP, and to single-breed evaluations. Different SNP densities, namely 9k and 45k, and different effective population sizes (Ne) were tested. Five breeds mimicking recent beef cattle populations that diverged from the same historical population were simulated using different selection criteria. It was assumed that QTL effects were the same over all breeds. For the recent population, generations 1 to 9 had approximately half of the animals genotyped, whereas all 1200 animals were genotyped in generation 10. Genotyped animals in generation 10 were set as validation; therefore, each breed had a validation set. Analyses were performed using single-step GBLUP (ssGBLUP). Prediction accuracy was calculated as the correlation between true (T) and genomic estimated (GE) BV. Accuracies of GEBV were lower for the larger Ne and low SNP density. All three scenarios using 45K resulted in similar accuracies, suggesting that the marker density is high enough to account for relationships and linkage disequilibrium with QTL. A shared multi-breed evaluation using 9K resulted in a decrease of accuracy of 0.08 for a smaller Ne and 0.11 for a larger Ne. This loss was mostly avoided when markers were treated as non-shared within the same genomic relationship matrix.
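
The accuracy measure used in this abstract, the correlation between true and genomic estimated breeding values in a validation set, can be sketched as a Pearson correlation in plain Python. The function name and the four breeding values below are hypothetical and only illustrate the calculation; in a simulation such as the one described, it would be applied to each breed's generation-10 validation animals.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences of numbers."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical validation animals: true BV (known in simulation) and GEBV.
tbv = [1.0, 2.0, 3.0, 4.0]
gebv = [1.5, 1.0, 3.5, 3.0]

accuracy = pearson(tbv, gebv)
```

True breeding values are only available in simulated data, which is why this form of accuracy suits simulation studies; with field data a proxy for the true value would be needed.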

