Prediction of Breeding Values with a Mixed Model with Heterogeneous Variances for Large-Scale Dairy Data

Bas Engel; Theo Meuwissen; Gerben de Jong; Willem Buist

doi:10.2307/1400596

A Shortcut Approach for Large-scale Mixed Model Associations with Binary Traits

10.21203/rs.3.rs-312421/v1 ◽

2021 ◽

Author(s):

Runqing Yang ◽

Jun Bao ◽

Runqing Yang ◽

Yuxin Song ◽

Zhiyu Hao ◽

...

Keyword(s):

Quantitative Trait ◽

Multiple Testing ◽

Statistical Power ◽

Large Scale ◽

Mixed Model ◽

Joint Analysis ◽

Genomic Breeding ◽

Breeding Values ◽

Genomic Control ◽

Genomic Heritability

Generalized linear mixed models are computationally intensive and prone to bias when mapping quantitative trait nucleotides for binary diseases. In genomic logit regression, we treat genomic breeding values estimated in advance as a known predictor and then correct the deflated association test statistics using genomic control, thereby extending GRAMMAR-Lambda to the analysis of binary diseases in a complex structured population. Because there is no need to estimate genomic heritability, and genomic breeding values can be estimated from a small number of sampled markers, generalized mixed-model association analysis is greatly simplified for large-scale data. With almost perfect genomic control, joint analysis of the candidate quantitative trait nucleotides chosen by multiple testing offered a significant improvement in statistical power.
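
The genomic control step mentioned in this abstract can be illustrated with a minimal sketch. It assumes 1-df chi-square association statistics and the usual median-based estimate of the inflation (here, deflation) factor lambda; the function name and the example statistics are hypothetical, and this is not the authors' implementation.

```python
import statistics

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_control(chi2_stats):
    """Return (lambda, rescaled statistics) for 1-df chi-square tests.

    lambda is the observed median divided by the expected median under
    the null; dividing every statistic by lambda restores the expected
    median. When lambda < 1 (deflated tests), this scales them up.
    """
    if not chi2_stats:
        raise ValueError("need at least one test statistic")
    lam = statistics.median(chi2_stats) / CHI2_1DF_MEDIAN
    if lam <= 0:
        raise ValueError("median statistic must be positive")
    return lam, [s / lam for s in chi2_stats]


# Hypothetical, deliberately deflated statistics (illustration only).
stats = [0.05, 0.10, 0.20, 0.30, 0.45, 6.00, 12.00]
lam, corrected = genomic_control(stats)
print(f"lambda = {lam:.3f}")
```

In this toy input the estimated lambda is below 1, so the correction enlarges every statistic, which is the direction of adjustment the abstract describes for deflated tests.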

Bayesian estimation of a surface to account for a spatial trend using penalized splines in an individual-tree mixed model

Canadian Journal of Forest Research ◽

10.1139/x07-116 ◽

2007 ◽

Vol 37 (12) ◽

pp. 2677-2688 ◽

Cited By ~ 15

Author(s):

Eduardo P. Cappa ◽

Rodolfo J.C. Cantet

Keyword(s):

Spatial Variability ◽

Large Scale ◽

Mixed Model ◽

Deviance Information Criterion ◽

Covariance Structure ◽

Information Criterion ◽

Penalized Splines ◽

Dispersion Parameters ◽

Individual Tree ◽

Breeding Values

Unaccounted-for spatial variability leads to bias in estimating genetic parameters and predicting breeding values from forest genetic trials. Previous attempts to account for large-scale continuous spatial variation employed spatial coordinates in the direction of the rows (or columns). In this research, we use an individual-tree mixed model and the tensor product of B-spline bases with a proper covariance structure for the random knot effects to account for spatial variability. Dispersion parameters were estimated using Bayesian techniques via Gibbs sampling. The procedure is illustrated with data from a progeny trial of Eucalyptus globulus subsp. globulus Labill. Four different models were then fitted. The first model included block effects and the three other models included a surface on a grid of either 8 × 8, 12 × 12, or 18 × 18 knots. The three models with B-splines displayed a sizeably lower value of the deviance information criterion than the model with blocks. Also, the mixed models fitting a surface displayed a consistent reduction in the posterior mean of $\sigma^2_e$, an increase in the posterior means of $\sigma^2_A$ and $h^2_{DBH}$, and an increase of 66% (for parents) or 60% (for offspring) in the accuracy of breeding values.

A Shortcut Approach for Large-scale Mixed Model Associations with Binary Traits

10.21203/rs.3.rs-312421/v2 ◽

2021 ◽

Author(s):

Runqing Yang ◽

Jun Bao ◽

Runqing Yang ◽

Yuxin Song ◽

Zhiyu Hao ◽

...

Keyword(s):

Quantitative Trait ◽

Multiple Testing ◽

Statistical Power ◽

Large Scale ◽

Mixed Model ◽

Joint Analysis ◽

Genomic Breeding ◽

Breeding Values ◽

Genomic Control ◽

Genomic Heritability

Generalized linear mixed models are computationally intensive and prone to bias when mapping quantitative trait nucleotides for binary diseases. In genomic logit regression, we treat genomic breeding values estimated in advance as a known predictor and then correct the deflated association test statistics using genomic control, thereby extending GRAMMAR-Lambda to the analysis of binary diseases in a complex structured population. Because there is no need to estimate genomic heritability, and genomic breeding values can be estimated from a small number of sampled markers, generalized mixed-model association analysis is greatly simplified for large-scale data. With almost perfect genomic control, joint analysis of the candidate quantitative trait nucleotides chosen by multiple testing offered a significant improvement in statistical power.

MegaLMM: Mega-scale linear mixed models for genomic predictions with thousands of traits

Genome Biology ◽

10.1186/s13059-021-02416-w ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Daniel E. Runcie ◽

Jiayi Qu ◽

Hao Cheng ◽

Lorin Crawford

Keyword(s):

Genomic Prediction ◽

Large Scale ◽

Mixed Model ◽

Human Genetics ◽

Linear Mixed Effect Model ◽

Mixed Effect ◽

Statistical Framework ◽

Effect Model ◽

Plant Data ◽

Genetic Value

Large-scale phenotype data can enhance the power of genomic prediction in plant and animal breeding, as well as human genetics. However, the statistical foundation of multi-trait genomic prediction is based on the multivariate linear mixed effect model, a tool notorious for its fragility when applied to more than a handful of traits. We present MegaLMM, a statistical framework and associated software package for mixed model analyses of a virtually unlimited number of traits. Using three examples with real plant data, we show that MegaLMM can leverage thousands of traits at once to significantly improve genetic value prediction accuracy.

Exome-wide association studies in general and long-lived populations identify genetic variants related to human age

10.1101/2020.07.19.188789 ◽

2020 ◽

Author(s):

Patrick Sin-Chan ◽

Nehal Gosalia ◽

Chuan Gao ◽

Cristopher V. Van Hout ◽

Bin Ye ◽

...

Keyword(s):

Exome Sequencing ◽

Large Scale ◽

Mixed Model ◽

Linear Mixed Model ◽

Association Studies ◽

Model Systems ◽

P Value ◽

Ashkenazi Jews ◽

Association Analyses ◽

Age Related

Aging is characterized by degeneration in cellular and organismal functions leading to increased disease susceptibility and death. Although our understanding of aging biology in model systems has increased dramatically, large-scale sequencing studies to understand human aging are only now beginning. We applied exome sequencing and association analyses (ExWAS) to identify age-related variants in 58,470 participants of the DiscovEHR cohort. Linear mixed model regression analyses of age at last encounter revealed, as top signals, variants in genes known to be linked with clonal hematopoiesis of indeterminate potential, which are associated with myelodysplastic syndromes. This suggests age-related somatic mutation accumulation in hematopoietic cells despite patients lacking clinical diagnoses. In addition to APOE, we identified rare DISP2 rs183775254 (p = 7.40 × 10⁻¹⁰) and ZYG11A rs74227999 (p = 2.50 × 10⁻⁸) variants that were negatively associated with age in both sexes combined and in females, respectively, and which were replicated with directional consistency in two independent cohorts. Epigenetic mapping showed these variants are located within cell-type-specific enhancers, suggestive of important transcriptional regulatory functions. To discover variants associated with extreme age, we performed exome sequencing on persons of Ashkenazi Jewish descent ascertained for extensive lifespans. Case-control analyses compared 525 Ashkenazi Jewish cases (males ≥ 92 years, females ≥ 95 years) with 482 controls. Our results showed that variants in APOE (rs429358, rs6857) and TMTC2 (rs7976168) passed the Bonferroni-adjusted p-value threshold, as did several nominally associated population-specific variants. Collectively, our Age-ExWAS, the largest performed to date, confirmed known variants and identified previously unreported candidate variants associated with human age.

297 GWAS for complex models accounting for populations structure with GBLUP and ssGBLUP

Journal of Animal Science ◽

10.1093/jas/skaa278.057 ◽

2020 ◽

Vol 98 (Supplement_4) ◽

pp. 32-32

Author(s):

Juan P Steibel ◽

Ignacio Aguilar

Keyword(s):

Hypothesis Testing ◽

Large Scale ◽

Mixed Model ◽

Prediction Models ◽

Association Studies ◽

Least Square ◽

Type I ◽

Phenotypic Variance ◽

Genome Wide Association Studies ◽

Formal Hypothesis Testing

Genomic Best Linear Unbiased Prediction (GBLUP) is the method of choice for incorporating genomic information into the genetic evaluation of livestock species. Furthermore, single-step GBLUP (ssGBLUP) has been adopted by many breeders' associations and private entities managing large-scale breeding programs. While prediction of breeding values remains the primary use of genomic markers in animal breeding, a secondary interest focuses on performing genome-wide association studies (GWAS). The goal of GWAS is to uncover genomic regions that harbor variants explaining a large proportion of the phenotypic variance, and which thus become candidates for discovering and studying causative variants. Several methods have been proposed and successfully applied for embedding GWAS into genomic prediction models. Most methods avoid formal hypothesis testing and resort to estimation of SNP effects, relying on visual inspection of graphical outputs to determine candidate regions. However, with the advent of high-throughput phenomics and transcriptomics, a more formal testing approach with automatic discovery thresholds is more appealing. In this work we present the methodological details of a method for performing formal hypothesis testing for GWAS in GBLUP models. First, we present the method and its equivalencies and differences with other GWAS methods. Moreover, we demonstrate through simulation analyses that the proposed method controls the type I error rate at the nominal level. Second, we demonstrate two possible computational implementations, one based on mixed model equations for ssGBLUP and one based on the generalized least squares (GLS) equations. We show that ssGBLUP can deal with datasets with an extremely large number of animals and markers and with multiple traits. GLS implementations are well suited for dealing with a smaller number of animals with tens of thousands of phenotypes. Third, we show several useful extensions, such as testing multiple markers at once, testing pleiotropic effects, and testing association of social genetic effects.

Reduced Animal Models Fitting Only Equations for Phenotyped Animals

Frontiers in Genetics ◽

10.3389/fgene.2021.637626 ◽

2021 ◽

Vol 12 ◽

Author(s):

Mohammad Ali Nilforooshan ◽

Dorian Garrick

Keyword(s):

Animal Model ◽

Animal Models ◽

Mixed Model ◽

Reduced Model ◽

Relationship Matrix ◽

Full Model ◽

Breeding Values ◽

Reduced Animal Model ◽

Numerator Relationship Matrix ◽

Model Equations

Reduced models are equivalent models to the full model that enable reduction in the computational demand for solving the problem, here, mixed model equations for estimating breeding values of selection candidates. Since phenotyped animals provide data to the model, the aim of this study was to reduce animal models to those equations corresponding to phenotyped animals. Non-phenotyped ancestral animals have normally been included in analyses as they facilitate formation of the inverse numerator relationship matrix. However, a reduced model can exclude those animals and obtain identical solutions for the breeding values of the animals of interest. Solutions corresponding to non-phenotyped animals can be back-solved from the solutions of phenotyped animals and specific blocks of the inverted relationship matrix. This idea was extended to other forms of animal model and the results from each reduced model (and back-solving) were identical to the results from the corresponding full model. Previous studies have been mainly focused on reduced animal models that absorb equations corresponding to non-parents and solve equations only for parents of phenotyped animals. These two types of reduced animal model can be combined to formulate only equations corresponding to phenotyped parents of phenotyped progeny.
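
The back-solving step described in this abstract can be written compactly under one standard assumption: a single additive genetic effect, with non-phenotyped animals contributing no records. The notation below is illustrative and not taken from the paper. Partition the inverse numerator relationship matrix by non-phenotyped (n) and phenotyped (p) animals; the rows of the mixed model equations for non-phenotyped animals then contain only relationship terms, which gives:

```latex
A^{-1} =
\begin{bmatrix}
A^{nn} & A^{np} \\
A^{pn} & A^{pp}
\end{bmatrix},
\qquad
A^{nn}\hat{u}_n + A^{np}\hat{u}_p = 0
\;\Longrightarrow\;
\hat{u}_n = -\left(A^{nn}\right)^{-1} A^{np}\,\hat{u}_p .
```

As a small check, take an unphenotyped sire and dam with one phenotyped offspring. Then $A^{nn} = \begin{bmatrix} 1.5 & 0.5 \\ 0.5 & 1.5 \end{bmatrix}$ and $A^{np} = \begin{bmatrix} -1 \\ -1 \end{bmatrix}$, so each parent's solution is $0.5\,\hat{u}_p$, the familiar regression of parent on offspring.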

Associations of Breeding Values for Disease Traits and Genetic Markers in Dairy Cattle Estimated with a Mixed Model

Journal of Dairy Science ◽

10.3168/jds.s0022-0302(93)77722-x ◽

1993 ◽

Vol 76 (12) ◽

pp. 3785-3791 ◽

Cited By ~ 3

Author(s):

Lena Andersson-Eklund ◽

Birgitta Danell

Keyword(s):

Dairy Cattle ◽

Genetic Markers ◽

Mixed Model ◽

Breeding Values

Scheduling Just-in-Time Transport Vehicles to Feed Parts for Mixed Model Assembly Lines

Discrete Dynamics in Nature and Society ◽

10.1155/2020/2939272 ◽

2020 ◽

Vol 2020 ◽

pp. 1-13 ◽

Cited By ~ 1

Author(s):

Yunfang Peng ◽

Tian Zeng ◽

Yajuan Han ◽

Beixin Xia

Keyword(s):

Large Scale ◽

Assembly Line ◽

Mixed Model ◽

Programming Model ◽

Primary Objective ◽

Mixed Integer ◽

Inventory Level ◽

Just In Time ◽

Time Model ◽

Delivery Strategy

To solve the problem of scheduling vehicles that feed parts to an automobile assembly line, this study proposes a just-in-time delivery method combined with the material supermarket mode. A mixed integer linear programming model with the primary objective of using the least number of tow trains is constructed by considering vehicle capacity and line-side inventory levels. Given the minimum number of tow trains, the schedule of each tour is then planned to minimize assembly line inventory, which is the secondary objective of the part supply problem. Additionally, a heuristic algorithm that can obtain a satisfactory solution in a short time is designed to solve large-scale problems, taking into account the continuity and complexity of modern automobile production. Furthermore, some cases are analyzed and compared with the widely used periodic delivery strategy, and the feasibility of the just-in-time model and algorithm is verified. The results reveal that the just-in-time delivery strategy reduces inventory levels more than the periodic delivery strategy.

Phenotypic Characterization of Milk Yield and Quality Traits in a Large Population of Water Buffaloes

Animals ◽

10.3390/ani10020327 ◽

2020 ◽

Vol 10 (2) ◽

pp. 327 ◽

Cited By ~ 2

Author(s):

Angela Costa ◽

Riccardo Negrini ◽

Massimo De Marchi ◽

Giuseppe Campanile ◽

Gianluca Neglia

Keyword(s):

Milk Yield ◽

Milk Fat ◽

Large Scale ◽

Mixed Model ◽

Large Population ◽

Solid Content ◽

Current Status ◽

Phenotypic Characterization ◽

Italian Population ◽

Milk Traits

The buffalo milk industry has economic and social relevance in Italy, as it is linked to the manufacture of traditional dairy products. To provide an overview of the current status of buffaloes' performances on a large scale, almost 1 million milk test-day records from 72,294 buffaloes were used to investigate milk yield, energy corrected milk, fat, protein, and lactose content, and somatic cell score (SCS). Phenotypic correlations between milk traits were calculated, and analysis of variance was carried out through a mixed model approach including the fixed effects of parity, stage of lactation, sampling time, month of calving, and all their interactions, and the random effects of buffalo, herd-test-date, and residual. Third-parity buffaloes were the most productive in terms of milk yield, while the lowest solid content was detected in sixth-parity buffaloes. A considerable gap between primiparous and multiparous buffaloes was observed for milk yield, especially in early and mid-lactation. Overall, SCS progressively increased with parity and showed a negative correlation with milk yield in both primiparous (−0.12) and multiparous (−0.14) buffaloes. Results suggested that, at the industrial level, milk of primiparous buffaloes may be preferred for transformation purposes, since it was characterized by greater solid content and lower SCS. Results of this study provide a picture of the Italian population of buffaloes under systematic performance recording and might be beneficial to both the dairy industry and breeding organizations.
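
The somatic cell score used in this abstract is conventionally derived from the somatic cell count (SCC, in cells/mL) by a base-2 log transformation. The sketch below assumes that conventional formula, SCS = 3 + log2(SCC / 100,000); the abstract does not state which transformation the study applied, and the function name is illustrative.

```python
import math


def somatic_cell_score(scc_cells_per_ml):
    """Conventional SCS: 3 + log2(SCC / 100,000), with SCC in cells/mL.

    This formula is an assumption for illustration; the abstract does
    not specify the transformation used.
    """
    if scc_cells_per_ml <= 0:
        raise ValueError("SCC must be positive")
    return 3.0 + math.log2(scc_cells_per_ml / 100_000)


# Each doubling of SCC raises the score by one unit.
for scc in (50_000, 100_000, 200_000):
    print(scc, somatic_cell_score(scc))
```

The log scale makes the heavily right-skewed cell counts closer to normally distributed, which suits the linear mixed model analysis described above.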
