The Normal Distribution And Data Transformation

Log transformation of proficiency testing data on the content of genetically modified organisms in food and feed samples: is it justified?

Analytical and Bioanalytical Chemistry ◽

10.1007/s00216-019-02338-4 ◽

2019 ◽

Vol 412 (5) ◽

pp. 1129-1136 ◽

Cited By ~ 3

Author(s):

Wim Broothaerts ◽

Fernando Cordeiro ◽

Philippe Corbisier ◽

Piotr Robouch ◽

Hendrik Emons

Keyword(s):

Normal Distribution ◽

Proficiency Testing ◽

Genetically Modified Organisms ◽

Genetically Modified ◽

Data Transformation ◽

Evaluation Procedure ◽

Reference Laboratory ◽

Log Data ◽

Log Transformation ◽

Food And Feed

Abstract: The outcome of proficiency tests (PTs) is influenced, among other factors, by the evaluation procedure chosen by the PT provider. In particular, for PTs on GMO testing, a log-data transformation is often applied to fit skewed data distributions to a normal distribution. The study presented here challenges this commonly applied approach. The 56 data populations from proficiency testing rounds organised since 2010 by the European Union Reference Laboratory for Genetically Modified Food and Feed (EURL GMFF) were used to investigate the assumption of a normal distribution of reported results within a PT. Statistical evaluation of the data distributions, composed of 3178 reported results, revealed that 41 of the 56 datasets indeed showed a normal distribution. For 10 datasets, the deviation from normality was not statistically significant at the raw or log scale, indicating that the normality assumption cannot be rejected. The normality of the five remaining datasets was statistically significant after log-data transformation. These datasets, however, appeared to be multimodal as a result of technical/experimental issues with the applied methods. On the basis of the real datasets analysed herein, it is concluded that the log transformation of reported data in proficiency testing rounds is often not necessary and should be applied cautiously. It is further shown that the log-data transformation, when applied to PT results, favours the positive performance scoring of overestimated results and strongly penalises underestimated results. The evaluation of the participants’ performance without prior transformation of their results may highlight rather than hide relevant underlying analytical problems and is recommended as an outcome of this study.
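
The scoring asymmetry described in the abstract can be illustrated with a minimal sketch. The assigned value, the two standard deviations for proficiency assessment and the two reported results are hypothetical numbers chosen for illustration, not figures from the study, and the |z| <= 2 acceptance limit is assumed as the usual z-score convention.

```python
import math

def z_raw(x, assigned, sigma):
    """z-score computed on the reported (raw) scale."""
    return (x - assigned) / sigma

def z_log(x, assigned, sigma_log):
    """z-score computed after a log10 transformation of result and assigned value."""
    return (math.log10(x) - math.log10(assigned)) / sigma_log

# Hypothetical values, for illustration only.
assigned = 1.0     # assigned value of the test item
sigma = 0.25       # standard deviation for proficiency assessment, raw scale
sigma_log = 0.10   # standard deviation for proficiency assessment, log10 scale

# Two results with the same absolute deviation from the assigned value.
over, under = 1.5, 0.5

for label, x in (("overestimate", over), ("underestimate", under)):
    print(label, round(z_raw(x, assigned, sigma), 2), round(z_log(x, assigned, sigma_log), 2))
```

On the raw scale both results receive the same absolute score. On the log scale the overestimate stays inside the assumed |z| <= 2 limit while the underestimate falls outside it, which is the direction of bias the abstract reports.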

A Quantitative Validation Method of Kriging Metamodel for Injection Mechanism Based on Bayesian Statistical Inference

Metals ◽

10.3390/met9050493 ◽

2019 ◽

Vol 9 (5) ◽

pp. 493 ◽

Cited By ~ 1

Author(s):

Dongdong You ◽

Xiaocheng Shen ◽

Yanghui Zhu ◽

Jianxin Deng ◽

Fenglei Li

Keyword(s):

Normal Distribution ◽

Hypothesis Test ◽

Data Transformation ◽

Kriging Model ◽

Data Uncertainty ◽

Small Range ◽

Test Results ◽

Kriging Metamodel ◽

Injection Mechanism ◽

Quantitative Validation

A Bayesian framework-based approach is proposed for the quantitative validation and calibration of the kriging metamodel established by simulation and experimental training samples of the injection mechanism in squeeze casting. The temperature data uncertainty and non-normal distribution are considered in the approach. The normality of the sample data is tested by the Anderson–Darling method. The test results show that the original difference data require transformation for Bayesian testing due to the non-normal distribution. The Box–Cox method is employed to transform the non-normal data. The hypothesis test results of the calibrated kriging model are more reliable after data transformation. The reliability of the kriging metamodel is quantitatively assessed by the calculated Bayes factor and confidence. The Bayes factor and confidence level results indicate that the kriging model demonstrates improved accuracy and is acceptable after data transformation. The influence of the threshold ε on both the non-normally and normally distributed data in the model is quantitatively evaluated. The threshold ε has a greater influence and higher sensitivity when applied to the normal data results, based on the rapid increase within a small range of the Bayes factors and confidence levels.
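
The two statistical steps named in the abstract, an Anderson–Darling normality check followed by a Box–Cox transformation, can be sketched as follows. This is not the authors' implementation: the data are synthetic log-normal values, the Box–Cox parameter is picked by a simple grid search over the profile log-likelihood, and only the A² statistic is compared before and after (no critical values are applied).

```python
import math
import random

def norm_cdf(z):
    """Standard normal CDF via erfc, which stays accurate in the far tails."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def anderson_darling(xs):
    """A^2 statistic for normality with mean and variance estimated from xs."""
    n = len(xs)
    mean = sum(xs) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))
    z = sorted((x - mean) / sd for x in xs)
    total = 0.0
    for i in range(n):
        # 0-based i, so (2i + 1) is the usual (2k - 1) weight; 1 - F(z) = F(-z)
        total += (2 * i + 1) * (math.log(norm_cdf(z[i])) + math.log(norm_cdf(-z[n - 1 - i])))
    return -n - total / n

def box_cox(xs, lam):
    """Box-Cox transform of strictly positive data."""
    if abs(lam) < 1e-12:
        return [math.log(x) for x in xs]
    return [(x ** lam - 1.0) / lam for x in xs]

def box_cox_loglik(xs, lam):
    """Profile log-likelihood of lambda under a normal model for the transformed data."""
    n = len(xs)
    ys = box_cox(xs, lam)
    mean = sum(ys) / n
    var = sum((y - mean) ** 2 for y in ys) / n
    return -0.5 * n * math.log(var) + (lam - 1.0) * sum(math.log(x) for x in xs)

def best_lambda(xs, lo=-2.0, hi=2.0, steps=401):
    """Grid search for the lambda that maximises the profile log-likelihood."""
    grid = [lo + (hi - lo) * k / (steps - 1) for k in range(steps)]
    return max(grid, key=lambda lam: box_cox_loglik(xs, lam))

# Synthetic, right-skewed sample standing in for the non-normal difference data.
rng = random.Random(0)
data = [rng.lognormvariate(0.0, 0.8) for _ in range(200)]

lam = best_lambda(data)
a2_before = anderson_darling(data)
a2_after = anderson_darling(box_cox(data, lam))
print("lambda:", lam, "A2 before:", a2_before, "A2 after:", a2_after)
```

A smaller A² after the transformation indicates that the transformed sample is closer to a normal distribution. A formal decision would additionally require comparing A² with tabulated critical values.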

Data transformation models utilized in Bayesian probabilistic forecast considering inflow forecasts

Hydrology Research ◽

10.2166/nh.2019.028 ◽

2019 ◽

Vol 50 (5) ◽

pp. 1267-1280

Author(s):

Wei Xu ◽

Xiaoying Fu ◽

Xia Li ◽

Ming Wang

Keyword(s):

Normal Distribution ◽

Forecast Accuracy ◽

Data Transformation ◽

Probabilistic Forecast ◽

Standard Normal Distribution ◽

Standard Normal ◽

Comparative Results ◽

Distribution Transformation ◽

Hydropower Reservoir ◽

Efficiency And Reliability

Abstract: This paper presents a new Bayesian probabilistic forecast (BPF) model to improve the efficiency and reliability of normal distribution transformation and to describe the uncertainties of medium-range forecasting inflows with 10-day forecast horizons. In this model, the inflow data are transformed twice to reach a standard normal distribution. The Box–Cox (BC) model is first used to quickly transform the inflow data toward a normal distribution, and then the transformed data are converted to a standard normal distribution by the meta-Gaussian (MG) model. Based on the transformed inflows in the standard normal distribution, the prior and likelihood density functions of the BPF are established. In this study, the newly developed model is tested on China's Huanren hydropower reservoir and is compared with BPFs using MG and BC separately. Comparative results show that the new BPF model exhibits significantly improved data transformation efficiency and forecast accuracy.
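
The two-stage transformation can be sketched under stated assumptions. The second stage is written here as a normal quantile transform based on empirical ranks, which is one common way to realise a meta-Gaussian mapping; the abstract does not specify the marginal distribution used in the paper. The inflow series is synthetic and the Box–Cox parameter of 0.3 is a hypothetical choice.

```python
import random
from statistics import NormalDist

def box_cox(xs, lam):
    """Stage 1: Box-Cox transform of strictly positive data (lam must be non-zero here)."""
    return [(x ** lam - 1.0) / lam for x in xs]

def normal_quantile_transform(ys):
    """Stage 2: map each value to the standard normal quantile of its empirical rank."""
    n = len(ys)
    order = sorted(range(n), key=lambda i: ys[i])
    std_normal = NormalDist()
    z = [0.0] * n
    for rank, i in enumerate(order, start=1):
        # Weibull plotting position rank / (n + 1) keeps probabilities inside (0, 1)
        z[i] = std_normal.inv_cdf(rank / (n + 1))
    return z

# Synthetic, right-skewed inflow series (hypothetical units).
rng = random.Random(1)
inflows = [rng.gammavariate(2.0, 150.0) for _ in range(300)]

stage1 = box_cox(inflows, 0.3)               # hypothetical lambda
stage2 = normal_quantile_transform(stage1)   # approximately standard normal scores

mean2 = sum(stage2) / len(stage2)
var2 = sum((z - mean2) ** 2 for z in stage2) / len(stage2)
print("mean:", mean2, "variance:", var2)
```

Both stages are monotone increasing, so the ordering of the inflows is preserved and the mapping can be inverted to return forecasts to the original scale.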

Appendix 2: Normal distribution probabilities

Flow Cytometry Data Analysis ◽

10.1017/cbo9780511600357.014 ◽

1992 ◽

pp. 250-251

Keyword(s):

Normal Distribution

On the Mathematical Basis of Zelen’s Prerandomized Designs

Methods of Information in Medicine ◽

10.1055/s-0038-1635370 ◽

1985 ◽

Vol 24 (03) ◽

pp. 120-130 ◽

Cited By ~ 28

Author(s):

E. Brunner ◽

N. Neumann

Keyword(s):

Clinical Trial ◽

Normal Distribution ◽

Random Variables ◽

Small Samples ◽

Selection Effects ◽

Sample Sizes ◽

Mathematical Basis ◽

Additional Assumptions

Summary: The mathematical basis of Zelen’s suggestion [4] of prerandomizing patients in a clinical trial and then asking them for their consent is investigated. The first problem is to estimate the therapy and selection effects. In the simple prerandomized design (PRD) this is possible without any problems. Similar observations have been made by Anbar [1] and McHugh [3]. However, for the double PRD additional assumptions are needed in order to render therapy and selection effects estimable. The second problem is to determine the distribution of the statistics. It has to be taken into consideration that the sample sizes are random variables in the PRDs. This is why the distribution of the statistics can only be determined asymptotically, even under the assumption of normal distribution. The behaviour of the statistics for small samples is investigated by means of simulations, where the statistics considered in the present paper are compared with the statistics suggested by Ihm [2]. It turns out that the statistics suggested in [2] may lead to anticonservative decisions, whereas the “canonical statistics” suggested by Zelen [4] and considered in the present paper keep the level quite well or may lead to slightly conservative decisions if there are considerable selection effects.

Interrelations of Thrombo-Embolic Diseases and Blood-Group Distribution

Thrombosis and Haemostasis ◽

10.1055/s-0038-1654999 ◽

1963 ◽

Vol 09 (02) ◽

pp. 472-474 ◽

Cited By ~ 12

Author(s):

W Dick ◽

W Schneider ◽

K Brockmüller ◽

W Mayer

Keyword(s):

Normal Distribution ◽

Blood Group ◽

Blood Groups ◽

The Other ◽

Other Hand ◽

Group Distribution

Summary: A comparison between the distribution of blood groups in 461 patients suffering from thromboembolic disorders and the normal distribution has shown a statistically significant predominance of group A1. On the other hand, blood groups O and A2 are distinctly less frequent than in the normal distribution.

Advanced Anti-Jam Indoor Adaptive GNSS Signal Acquisition: Part 1, Normal Distribution – Theory

Proceedings of the 2017 International Technical Meeting of The Institute of Navigation ◽

10.33012/2017.14934 ◽

2017 ◽

Author(s):

Ilir F. Progri

Keyword(s):

Normal Distribution ◽

Distribution Theory ◽

Signal Acquisition

Bankruptcy Model Construction and its Limitation in Input Data Quality

Journal of Business and Economics ◽

10.15341/jbe(2155-7950)/02.10.2019/003 ◽

2019 ◽

Vol 10 (2) ◽

pp. 117-125

Author(s):

Dana Kubíčková ◽

Vladimír Nulíček

Keyword(s):

Data Quality ◽

Normal Distribution ◽

Input Data ◽

Model Construction ◽

The Third ◽

Third Stage ◽

Discriminant Analyses ◽

One Year ◽

The University ◽

Multivariate Discriminant

The aim of the research project carried out at the University of Finance and Administration is to construct a new bankruptcy model. The intention is to use data from firms that had to cease their activities due to bankruptcy. The most common method for bankruptcy model construction is multivariate discriminant analysis (MDA). It makes it possible to derive the indicators most sensitive to a company's future failure as parts of the bankruptcy model. One of the assumptions for using the MDA method and ensuring reliable results is the normal distribution and independence of the input data. The results of verifying this assumption, as the third stage of the project, are presented in this article. We found that this assumption is met for only a few selected indicators. Better results were achieved for the indicators in the set of prosperous companies and one year prior to failure. The selected indicators intended for the bankruptcy model construction thus cannot be considered suitable for the MDA method.
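
Screening indicators for normality before applying MDA can be sketched as below. The abstract does not state which normality test the authors used, so the Jarque–Bera test is swapped in here as a stand-in; the indicator names and values are synthetic and purely illustrative.

```python
import math
import random

def jarque_bera(xs):
    """Jarque-Bera statistic and asymptotic p-value (chi-square, 2 degrees of freedom)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    # For 2 degrees of freedom the chi-square survival function is exp(-x / 2)
    p_value = math.exp(-jb / 2.0)
    return jb, p_value

# Synthetic financial indicators (hypothetical names and distributions).
rng = random.Random(42)
indicators = {
    "ratio_a": [rng.gauss(0.10, 0.05) for _ in range(150)],         # roughly symmetric
    "ratio_b": [rng.lognormvariate(0.0, 1.0) for _ in range(150)],  # strongly right-skewed
}

results = {name: jarque_bera(values) for name, values in indicators.items()}
for name, (jb, p) in results.items():
    verdict = "normality rejected" if p < 0.05 else "normality not rejected"
    print(name, round(jb, 2), verdict)
```

An indicator for which normality is rejected would need transformation or a different classification method before being used in MDA. The p-value relies on a large-sample approximation, so it should be read with caution for small sets of companies.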

Analytical Representation of the Density Function of Normal Distribution of Noise

Journal of Automation and Information Sciences ◽

10.1615/jautomatinfscien.v47.i8.30 ◽

2015 ◽

Vol 47 (8) ◽

pp. 24-40 ◽

Cited By ~ 3

Author(s):

Telman Abbas ogly Aliev ◽

Naila F. Musaeva ◽

Matanat Tair kyzy Suleymanova ◽

Bahruz Ismail ogly Gazizade

Keyword(s):

Normal Distribution ◽

Density Function ◽

Analytical Representation

Technology for Calculating the Parameters of the Density Function of the Normal Distribution of the Useful Component in a Noisy Process

Journal of Automation and Information Sciences ◽

10.1615/jautomatinfscien.v48.i4.50 ◽

2016 ◽

Vol 48 (4) ◽

pp. 39-55 ◽

Cited By ~ 4

Author(s):

Telman Abbas ogly Aliev ◽

Naila Fuad kyzy Musaeva ◽

Matanat Tair kyzy Suleymanova ◽

Bahruz Ismail ogly Gazizade

Keyword(s):

Normal Distribution ◽

Density Function
