Selection of locations for safflower cultivar trials on the Canadian prairies by using the AMMI procedure

1995 ◽  
Vol 75 (4) ◽  
pp. 767-774
Author(s):  
H.-H. Mündel ◽  
T. Entz ◽  
J. P. Braun ◽  
F. A. Kiehn

Additive main effects and multiplicative interaction (AMMI) analysis of Safflower Cooperative Registration Test (SCRT) data gathered from 1984 to 1991 across the Canadian prairies was used to assess the possibility of reducing the number of locations for cultivar evaluation. The cultivars Saffire, Hartman, S-208, and S-541 were included in the 1984–1986 data set; and Saffire, AC Stirling, S-208, and S-541 in the 1988–1991 set. Seed yield, percent oil, days to maturity, and test weight were measured at 12 locations, although due to weather conditions, data were sometimes not available for all locations in any given year. The AMMI model fit the data well for all four traits, and indicated that among-year variability at a given location was usually higher than inter-location variability in a given year. Cultivar interaction effects for all four characteristics assessed were usually large for both data sets, indicating that differences among cultivars at a given location can vary considerably over years. Intra-location variability was not consistent for the four traits and no clear grouping of locations or locations with cultivars over years was evident. These results suggest that local environmental factors significantly influence safflower traits, and potential cultivars need to be evaluated at as many locations as resources permit. Key words: Carthamus tinctorius, cultivar × environment interactions, yield, oil, maturity, test weight
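The AMMI decomposition used above can be sketched numerically: genotype and environment main effects are removed from a genotype × environment table of means, and the interaction residual is factored by singular value decomposition into IPCA axes. The yields below are invented for illustration, not SCRT data:

```python
import numpy as np

# Hypothetical genotype x environment table of mean yields
# (rows: 4 cultivars, columns: 5 locations); values are illustrative only.
Y = np.array([
    [1.9, 2.1, 1.5, 2.4, 2.0],
    [2.2, 2.0, 1.7, 2.6, 2.3],
    [1.8, 2.3, 1.6, 2.2, 1.9],
    [2.0, 2.2, 1.4, 2.5, 2.1],
])

grand = Y.mean()
g_eff = Y.mean(axis=1) - grand          # genotype main effects
e_eff = Y.mean(axis=0) - grand          # environment main effects

# Interaction residuals after removing the additive main effects
resid = Y - grand - g_eff[:, None] - e_eff[None, :]

# Multiplicative part: SVD of the residual matrix gives the IPCA axes
U, s, Vt = np.linalg.svd(resid, full_matrices=False)
ipca1_genotype = U[:, 0] * np.sqrt(s[0])      # genotype scores on IPCA1
ipca1_environment = Vt[0, :] * np.sqrt(s[0])  # environment scores on IPCA1

# Share of the interaction sum of squares captured by IPCA1
share = s[0]**2 / (s**2).sum()
print(f"IPCA1 captures {share:.1%} of the G x E sum of squares")
```

By construction every row and column of the residual matrix sums to zero, which is a quick sanity check on the main-effect removal.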

2007 ◽  
Vol 37 (3) ◽  
pp. 656-661 ◽  
Author(s):  
Velci Queiroz de Souza ◽  
Arione da Silva Pereira ◽  
Giovani Olegário da Silva ◽  
Roberto Fritsche Neto ◽  
Antônio Costa de Oliveira

The objective of this research was to compare the consistency of the bi-segmented and AMMI (additive main effects and multiplicative interaction) methods for estimating yield stability in potatoes. Data from ten genotypes evaluated in 34 environments (location, growing season and year combinations) of Rio Grande do Sul state, Brazil, in 1994 and 1995 were used. Three data sets were analyzed: the full 34-environment data set and two 17-environment subsets obtained by randomly dividing the total data set. For the 34-environment data set, the models gave similar results for the stable genotypes but differed with regard to the unstable ones. For the 17-environment data sets, the bi-segmented model showed more consistent results, both between the subsets and between each subset and the total data set. For the AMMI model, only the Santo Amor genotype showed consistency between one of the subsets and the total data set. In this work, the bi-segmented method was thus shown to be more consistent than the AMMI model.


Author(s):  
Antonia J. Jones ◽  
Dafydd Evans ◽  
Steve Margetts ◽  
Peter J. Durrant

The Gamma Test is a non-linear modelling analysis tool that allows us to quantify the extent to which a numerical input/output data set can be expressed as a smooth relationship. In essence, it allows us to efficiently calculate that part of the variance of the output that cannot be accounted for by the existence of any smooth model based on the inputs, even though this model is unknown. A key aspect of this tool is its speed: the Gamma Test has time complexity O(M log M), where M is the number of data points. For data sets consisting of a few thousand points and a reasonable number of attributes, a single run of the Gamma Test typically takes a few seconds. In this chapter we will show how the Gamma Test can be used in the construction of predictive models and classifiers for numerical data. In doing so, we will demonstrate the use of this technique for feature selection, and for the selection of embedding dimension when dealing with a time series.
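The Gamma Test itself can be sketched as follows: for each point, γ(k) (half the mean squared output difference to the kth nearest input neighbour) is regressed on δ(k) (the mean squared input distance to that neighbour), and the intercept estimates the output variance not attributable to any smooth model. This brute-force neighbour search is O(M²); a kd-tree search gives the O(M log M) behaviour cited above:

```python
import numpy as np

def gamma_test(X, y, p=10):
    """Estimate the noise variance of y given inputs X (the Gamma statistic).

    Brute-force k-nearest-neighbour version for clarity; production code
    would use a kd-tree to reach O(M log M).
    """
    X = np.atleast_2d(X)
    M = X.shape[0]
    # Pairwise squared distances between input points
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)
    order = np.argsort(d2, axis=1)      # neighbours by increasing distance

    delta = np.empty(p)                 # mean squared input distance, kth NN
    gamma = np.empty(p)                 # half mean squared output difference
    for k in range(p):
        nn = order[:, k]
        delta[k] = d2[np.arange(M), nn].mean()
        gamma[k] = 0.5 * ((y[nn] - y) ** 2).mean()

    # Regress gamma on delta; the intercept estimates Var(noise)
    slope, intercept = np.polyfit(delta, gamma, 1)
    return intercept

rng = np.random.default_rng(0)
x = rng.uniform(0, 2 * np.pi, size=(800, 1))
y = np.sin(x[:, 0]) + rng.normal(0, 0.2, size=800)  # true noise variance 0.04
g_hat = gamma_test(x, y)
print(g_hat)   # should land close to the true noise variance of 0.04
```

The estimate approaches the true noise variance as M grows, which is exactly the "unaccountable variance" the chapter describes.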


1988 ◽  
Vol 254 (1) ◽  
pp. E104-E112
Author(s):  
B. Candas ◽  
J. Lalonde ◽  
M. Normand

The aim of this study was to select the number of compartments required for a model to represent the distribution and metabolism of corticotropin-releasing factor (CRF) in rats. The dynamics of labeled rat CRF were measured in plasma for seven rats after a rapid injection. The sampling schedule combined the two D-optimal sampling-time sets corresponding to the two rival models. This protocol improved the numerical identifiability of the parameters and consequently facilitated selection of the relevant model. A three-compartment model fitted the seven individual dynamics adequately and represented four of them better than the lower-order model. Simulations that included the measurement errors and the interindividual variability of the parameters demonstrated that this four-to-seven ratio of data sets is consistent with the relevance of the three-compartment model for every individual kinetic data set. Kinetic and metabolic parameters were then derived for each individual rat, their values being consistent with the prolonged effects of CRF on pituitary-adrenocortical secretion.
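The D-optimal design idea used above can be sketched for a hypothetical bi-exponential (two-compartment) disappearance curve: candidate sampling times are scored by the determinant of JᵀJ, where J holds the model sensitivities at the chosen times. The nominal parameter values and candidate times below are illustrative, not the CRF values from this study:

```python
import numpy as np
from itertools import combinations

# Hypothetical two-compartment (bi-exponential) disappearance curve:
#   C(t) = A1*exp(-l1*t) + A2*exp(-l2*t)
# Nominal parameter values are illustrative only.
A1, l1, A2, l2 = 0.7, 0.5, 0.3, 0.05

def sensitivities(t):
    """Rows of the Jacobian dC/d(A1, l1, A2, l2) at times t."""
    e1, e2 = np.exp(-l1 * t), np.exp(-l2 * t)
    return np.column_stack([e1, -A1 * t * e1, e2, -A2 * t * e2])

candidates = np.array([0.5, 1, 2, 4, 8, 15, 30, 60, 90, 120])  # minutes

# D-optimality: choose the 4 times maximizing det(J^T J)
best_det, best_times = -np.inf, None
for times in combinations(candidates, 4):
    J = sensitivities(np.array(times))
    d = np.linalg.det(J.T @ J)
    if d > best_det:
        best_det, best_times = d, times

print("D-optimal 4-point schedule:", best_times)
```

In the study, the D-optimal schedules of the two rival models were merged into a single protocol, which improves numerical identifiability under either candidate model.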


2019 ◽  
Author(s):  
Hugh G. Gauch ◽  
David R. Moran

ABSTRACT The Additive Main Effects and Multiplicative Interaction (AMMI) model has been used extensively for analysis of multi-environment yield trials for two main purposes: understanding complex genotype-by-environment interactions and increasing accuracy. A 2013 paper in Crop Science presented a protocol for AMMI analysis with best practices, which has four steps: (i) analysis of variance, (ii) model diagnosis, (iii) mega-environment delineation, and (iv) agricultural recommendations. This preprint announces free open-source software, called AMMISOFT, which makes it easy to implement this protocol and thereby to accelerate crop improvement.


2021 ◽  
Vol 79 (1) ◽  
Author(s):  
Romana Haneef ◽  
Sofiane Kab ◽  
Rok Hrzic ◽  
Sonsoles Fuentes ◽  
Sandrine Fosse-Edorh ◽  
...  

Abstract Background The use of machine learning techniques is increasing in healthcare, allowing health outcomes to be estimated and predicted from large administrative data sets more efficiently. The main objective of this study was to develop a generic machine learning (ML) algorithm to estimate the incidence of diabetes based on the number of reimbursements over the last two years. Methods We selected a final data set from a population-based epidemiological cohort (CONSTANCES) linked with the French National Health Database (SNDS). To develop this algorithm, we adopted a supervised ML approach with the following steps: (i) selection of the final data set, (ii) target definition, (iii) coding of variables for a given window of time, (iv) splitting of the final data into training and test data sets, (v) variable selection, (vi) model training, (vii) validation of the model with the test data set, and (viii) model selection. We used the area under the receiver operating characteristic curve (AUC) to select the best algorithm. Results The final data set used to develop the algorithm included 44,659 participants from CONSTANCES. Of the 3468 SNDS variables linked to the CONSTANCES cohort that were coded, 23 were selected to train the different algorithms. The final algorithm to estimate the incidence of diabetes was a linear discriminant analysis model based on the number of reimbursements of selected variables related to biological tests, drugs, medical acts and hospitalization without a procedure over the last two years. This algorithm has a sensitivity of 62%, a specificity of 67% and an accuracy of 67% [95% CI: 0.66–0.68]. Conclusions Supervised ML is an innovative tool for the development of new methods to exploit large health administrative databases. In the context of the InfAct project, we have developed and applied, for the first time, a generic ML algorithm to estimate the incidence of diabetes for public health surveillance. The algorithm we have developed has moderate performance.
The next step is to apply this algorithm to the SNDS to estimate the incidence of type 2 diabetes cases. More research is needed to apply various ML techniques to estimate the incidence of other health conditions.
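The modelling core of the pipeline above, fitting a linear discriminant on a training split, scoring the held-out split, and reading off the AUC, can be sketched with synthetic data. The features, counts and labels below are invented for illustration; the real study used SNDS reimbursement variables:

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical reimbursement-count features for 2000 people; the class label
# stands in for "incident diabetes". Purely synthetic data.
n, p = 2000, 5
y = rng.integers(0, 2, n)
X = rng.poisson(2.0, (n, p)) + y[:, None] * rng.poisson(1.0, (n, p))

train, test = np.arange(n) < 1500, np.arange(n) >= 1500

def lda_fit(X, y):
    """Binary LDA: discriminant direction w = Sigma^-1 (mu1 - mu0)."""
    X0, X1 = X[y == 0], X[y == 1]
    mu0, mu1 = X0.mean(0), X1.mean(0)
    # Pooled within-class covariance
    S = (np.cov(X0.T) * (len(X0) - 1) + np.cov(X1.T) * (len(X1) - 1)) / (len(X) - 2)
    return np.linalg.solve(S, mu1 - mu0)

def auc(scores, labels):
    """AUC via the rank-sum (Mann-Whitney) statistic."""
    r = scores.argsort().argsort() + 1          # ranks, 1-based
    n1 = labels.sum()
    n0 = len(labels) - n1
    return (r[labels == 1].sum() - n1 * (n1 + 1) / 2) / (n0 * n1)

w = lda_fit(X[train], y[train])
scores = X[test] @ w
test_auc = auc(scores, y[test])
print("test AUC:", round(test_auc, 3))
```

Model comparison then reduces to repeating the fit-score-AUC loop for each candidate algorithm and keeping the one with the highest test AUC, as in step (viii) above.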


2019 ◽  
Vol 5 (10) ◽  
pp. 2120-2130 ◽  
Author(s):  
Suraj Kumar ◽  
Thendiyath Roshni ◽  
Dar Himayoun

A reliable method of rainfall-runoff modelling is a prerequisite for proper management and mitigation of extreme events such as floods. The objective of this paper is to contrast the hydrological performance of an Emotional Neural Network (ENN) and an Artificial Neural Network (ANN) for modelling rainfall-runoff in the Sone Command, Bihar, as this area experiences floods due to heavy rainfall. The ENN is a modified version of the ANN that includes additional neural parameters which enhance the network learning process. Selection of inputs is a crucial task for a rainfall-runoff model, and this paper uses cross-correlation analysis to select potential predictors. Three sets of input data (Set 1, Set 2 and Set 3) were prepared using weather and discharge data from two rain-gauge stations and one discharge station located in the command for the period 1986-2014. Principal Component Analysis (PCA) was then performed on the selected data sets to retain those showing the principal tendencies, and the resulting data sets were used to develop the ENN and ANN models. Performance indices were computed for the developed models on the three data sets. The results for Set 2 showed that the ENN, with R = 0.933, R2 = 0.870, Nash-Sutcliffe efficiency = 0.8689, RMSE = 276.1359 and relative peak error = 0.00879, outperforms the ANN in simulating discharge. The ENN is therefore suggested as the better model for rainfall-runoff modelling in the Sone Command, Bihar.
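The performance indices reported above (RMSE, Nash-Sutcliffe efficiency, relative peak error) can be computed as in this sketch; the discharge values are invented for illustration, not the Sone Command series:

```python
import numpy as np

def rmse(obs, sim):
    """Root mean squared error between observed and simulated series."""
    return np.sqrt(np.mean((obs - sim) ** 2))

def nash_sutcliffe(obs, sim):
    """NSE = 1 - SSE / variance about the observed mean; 1 is a perfect fit."""
    return 1 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

def relative_peak_error(obs, sim):
    """Relative error in the reproduced peak discharge."""
    return abs(sim.max() - obs.max()) / obs.max()

# Illustrative observed and simulated discharge series (m^3/s)
obs = np.array([120.0, 340.0, 980.0, 640.0, 310.0, 150.0])
sim = np.array([135.0, 310.0, 955.0, 660.0, 295.0, 170.0])

print(f"RMSE = {rmse(obs, sim):.2f}, NSE = {nash_sutcliffe(obs, sim):.3f}, "
      f"RPE = {relative_peak_error(obs, sim):.4f}")
```

An NSE near 1 and a small RPE together indicate that both the overall hydrograph and the flood peak are reproduced well, which is why both indices are reported alongside RMSE.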


Author(s):  
Hemant Kumar ◽  
G.P. Dixit ◽  
N.P. Singh ◽  
A.K. Srivastava

Multi-environment trials generally show significant genotype main effects and genotype × environment interaction (GEI) effects, and therefore different univariate and multivariate stability methods have been used to study GEI. Among the multivariate methods, the additive main effects and multiplicative interaction (AMMI) analysis is widely used for GEI investigation. This method is effective because it captures a large portion of the GEI sum of squares, clearly separates main and interaction effects, and often provides a meaningful interpretation of the data to support a breeding program, for example on genotypic stability. Based on the AMMI model, a stability index has been used to rank the genotypes. This index combines weighted yield and stability components; the higher the index value, the better the genotype. The index was calculated for 40 promising chickpea genotypes with two different weights for the yield (50% and 75%) and stability (50% and 25%) components. These genotypes were evaluated at seven locations (Hiriyur, Nandyal, Coimbatore, Dharwad, Lam, Bijapur and Gulbarga), representing the south zone of the All India Coordinated Research Project on Chickpea, during 2015-16. Genotypes were ranked under both weightings of the stability and yield components.
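One common way to build such a weighted yield-plus-stability index is to combine ranks of mean yield and of an AMMI stability value (ASV). In this rank-based formulation a lower index is better, which may differ in detail from the index used in the study; all numbers below are hypothetical:

```python
import numpy as np

# Illustrative genotype summaries: mean yield (kg/ha) and an AMMI stability
# value (ASV; lower = more stable). Numbers are hypothetical.
names = ["G1", "G2", "G3", "G4", "G5"]
yield_mean = np.array([1450.0, 1520.0, 1380.0, 1490.0, 1410.0])
asv = np.array([0.42, 0.95, 0.20, 0.33, 0.61])

def rank(values, ascending=True):
    """1-based ranks; rank 1 = best."""
    order = values.argsort() if ascending else (-values).argsort()
    r = np.empty_like(order)
    r[order] = np.arange(1, len(values) + 1)
    return r

def stability_index(w_yield):
    """Weighted sum of yield rank and stability rank; lower = better."""
    return w_yield * rank(yield_mean, ascending=False) + (1 - w_yield) * rank(asv)

for w in (0.50, 0.75):   # the two yield weights used in the study
    idx = stability_index(w)
    best = names[int(idx.argmin())]
    print(f"yield weight {w:.0%}: best genotype {best}, indices {idx}")
```

Changing the yield weight from 50% to 75% shifts the ranking toward high-yielding but less stable genotypes, which is exactly the trade-off the two weightings in the study are meant to expose.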


2012 ◽  
Vol 52 (No. 4) ◽  
pp. 188-196 ◽  
Author(s):  
Y. Lei ◽  
S. Y Zhang

Forest modellers have long faced the problem of selecting an appropriate mathematical model to describe tree ontogenetic or size-shape empirical relationships for tree species. A common practice is to develop many models (a model pool) with different functional forms and then to select the most appropriate one for a given data set. However, this process may impose subjective restrictions on the functional form. In this process, little attention is paid to the features of different functional forms (e.g. asymptotic versus nonasymptotic behaviour, presence of an inflection point) or to the intrinsic curve shape of a given data set. In order to find a better way of comparing and selecting growth models, this paper describes and analyses the characteristics of the Schnute model, which has a flexibility and versatility that have not been exploited in forestry. In this study, the Schnute model was applied to different data sets of selected forest species to determine their functional forms. The results indicate that the model shows desirable properties for the examined data sets and allows for discerning different intrinsic curve shapes such as sigmoid, concave and others. Since a suitable functional form for a given data set is usually not known prior to the comparison of candidate models, it is recommended that the Schnute model be used as a first step to determine an appropriate functional form of the data set under investigation, in order to avoid adopting a functional form a priori.
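The four-parameter Schnute function referred to above can be sketched as follows (the a ≠ 0, b ≠ 0 case): t1 and t2 are the first and last ages, y1 and y2 the sizes at those ages, and a and b control the curve shape. Parameter values in the example are hypothetical:

```python
import numpy as np

def schnute(t, y1, y2, a, b, t1, t2):
    """Schnute growth function, case a != 0, b != 0.

    Passes through (t1, y1) and (t2, y2); a and b control the curve shape,
    allowing sigmoid, concave and other forms within one model.
    """
    frac = (1 - np.exp(-a * (t - t1))) / (1 - np.exp(-a * (t2 - t1)))
    return (y1**b + (y2**b - y1**b) * frac) ** (1 / b)

# Illustrative use: tree height (m) over age (years); parameters hypothetical
t = np.linspace(5, 100, 20)
h = schnute(t, y1=2.0, y2=30.0, a=0.05, b=0.8, t1=5, t2=100)
print(np.round(h, 2))
```

By construction the curve passes exactly through (t1, y1) and (t2, y2), a useful sanity check when fitting the remaining shape parameters a and b to data.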


2001 ◽  
Vol 57 (4) ◽  
pp. 497-506 ◽  
Author(s):  
A. T. H. Lenstra ◽  
O. N. Kataeva

The crystal structures of the title compounds were determined with net intensities I derived via the background–peak–background procedure. Least-squares optimizations reveal differences between the low-order (0 < s < 0.7 Å⁻¹) and high-order (0.7 < s < 1.0 Å⁻¹) structure models. The scale factors indicate discrepancies of up to 10% between the low-order and high-order reflection intensities. This observation is compound independent. It reflects the scan-angle-induced truncation error, because the applied scan angle (0.8 + 2.0 tan θ)° underestimates the wavelength dispersion in the monochromated X-ray beam. The observed crystal structures show pseudo-I-centred sublattices for three of the non-H atoms in the asymmetric unit. Our selection of observed intensities (I > 3σ) stresses that pseudo-symmetry. Model refinements on individual data sets with (h + k + l) = 2n and (h + k + l) = 2n + 1 illustrate the lack of model robustness caused by that pseudo-symmetry. To obtain a better balanced data set, and thus a more robust structure, we decided to exploit background modelling. We described the background intensities B(H) with an 11th-degree polynomial in θ. This function predicts the local background b at each position H and defines the counting-statistics distribution P(B), in which b serves as both average and variance. The observation R defines P(R). This leads to P(I) = P(R)/P(B) and thus I = R − b and σ²(I) = I, so that the error σ(I) is background independent. Within this framework we reanalysed the structure of the copper(II) derivative. Background modelling resulted in a structure model with improved internal consistency. At the same time the unweighted R value based on all observations decreased from 10.6 to 8.4%. A redetermination of the structure at 120 K concluded the analysis.
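The background-modelling step can be sketched numerically: fit a smooth degree-11 polynomial in θ to background counts, then form net intensities I = R − b from the modelled background rather than from the noisy individual background measurement. All counts below are synthetic, and a Chebyshev basis is used purely for numerical stability:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic example: a smooth "true" background as a function of theta
# (degrees), observed as Poisson counts. All numbers are illustrative.
theta = np.sort(rng.uniform(2.0, 25.0, 400))
true_bg = 50 + 3 * theta + 0.05 * theta**2
B_obs = rng.poisson(true_bg)

# Model the background with a degree-11 polynomial in theta, as in the paper
# (fitted in a Chebyshev basis for numerical stability).
bg_model = np.polynomial.Chebyshev.fit(theta, B_obs, deg=11)
b = bg_model(theta)                      # smooth local background estimate

# Net intensity at each position: I = R - b, with b taken from the smooth
# model rather than from the noisy individual background measurement.
R = rng.poisson(true_bg + 200.0)         # synthetic peak-region counts
I = R - b

print("mean net intensity:", round(I.mean(), 1))   # should be near 200
```

Because b comes from a smooth fit over many positions, its own counting noise is largely averaged out, which is the sense in which the resulting error estimate no longer depends on two noisy local background measurements.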


2015 ◽  
Vol 2015 ◽  
pp. 1-10 ◽  
Author(s):  
J. Zyprych-Walczak ◽  
A. Szabelska ◽  
L. Handschuh ◽  
K. Górczak ◽  
K. Klamecka ◽  
...  

High-throughput sequencing technologies, such as the Illumina HiSeq, are powerful new tools for investigating a wide range of biological and medical problems. The massive and complex data sets produced by the sequencers create a need for the development of statistical and computational methods that can tackle their analysis and management. Data normalization is one of the most crucial steps of data processing, and it must be considered carefully as it has a profound effect on the results of the analysis. In this work, we focus on a comprehensive comparison of five normalization methods related to sequencing depth that are widely used for transcriptome sequencing (RNA-seq) data, and of their impact on the results of gene expression analysis. Based on this study, we suggest a universal workflow that can be applied to select the optimal normalization procedure for any particular data set. The described workflow includes calculation of bias and variance values for the control genes, of the sensitivity and specificity of the methods, and of classification errors, as well as generation of diagnostic plots. Combining this information facilitates the selection of the most appropriate normalization method for the studied data sets and determines which methods can be used interchangeably.
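As one concrete example of a sequencing-depth normalization of the kind compared above, the median-of-ratios method (popularized by DESeq) can be sketched as follows; whether it was among the five methods in this particular study is not stated here, and the counts are synthetic:

```python
import numpy as np

def median_of_ratios_factors(counts):
    """Per-sample size factors via the median-of-ratios (DESeq-style) method.

    counts: genes x samples matrix of raw read counts.
    """
    counts = np.asarray(counts, dtype=float)
    # Geometric-mean reference per gene (genes with any zero are dropped)
    keep = (counts > 0).all(axis=1)
    log_geo_mean = np.log(counts[keep]).mean(axis=1)
    # Size factor = median ratio of a sample's counts to the reference
    log_ratios = np.log(counts[keep]) - log_geo_mean[:, None]
    return np.exp(np.median(log_ratios, axis=0))

# Synthetic counts: sample 2 sequenced at twice the depth of sample 1
rng = np.random.default_rng(7)
base = rng.integers(10, 1000, size=200)
counts = np.column_stack([rng.poisson(base), rng.poisson(2 * base)])

factors = median_of_ratios_factors(counts)
normalized = counts / factors
print("size factors:", np.round(factors, 2))   # roughly (0.71, 1.41)
```

Dividing each sample by its size factor removes the depth difference, so downstream differential-expression statistics compare biology rather than sequencing effort; the workflow above then evaluates such methods via bias, variance, sensitivity and specificity.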

