A RANGE CRITERION FOR TESTING AN OUTLYING OBSERVATION

Estimation of the mean of the exponential distribution with an outlying observation

Communications in Statistics - Theory and Methods ◽

10.1080/03610928208828320 ◽

1982 ◽

Vol 11 (13) ◽

pp. 1439-1452 ◽

Cited By ~ 13

Author(s):

Burkhard O. Rauhut

Keyword(s):

Exponential Distribution ◽

The Mean ◽

Outlying Observation


How to decide objectively whether an outlying observation should be rejected

10.6028/nbs.rpt.1626 ◽

1952 ◽

Author(s):

Frank Proschan

Keyword(s):

Outlying Observation


Robust and efficient identification of biomarkers from RNA-Seq data using median control chart

F1000Research ◽

10.12688/f1000research.17351.1 ◽

2019 ◽

Vol 8 ◽

pp. 7

Author(s):

Md Shahjaman ◽

Habiba Akter ◽

Md. Mamunur Rashid ◽

Md. Ibnul Asifuzzaman ◽

Md. Bipul Hossen ◽

...

Keyword(s):

Control Chart ◽

Negative Binomial ◽

Rna Seq ◽

Traditional Methods ◽

Experimental Conditions ◽

Biomarker Identification ◽

Next Generation Sequencing Technology ◽

Wet Lab ◽

Outlying Observation ◽

Generation Sequencing

Background: One of the main goals of RNA-seq data analysis is the identification of biomarkers that are differentially expressed (DE) across two or more experimental conditions. RNA-seq uses next generation sequencing technology and has many advantages over microarrays. Numerous statistical methods have already been developed for identifying biomarkers from RNA-seq data. Most of these methods are based on either the Poisson distribution or the negative binomial distribution. However, existing methods struggle to identify biomarkers efficiently from discrete RNA-seq data when the datasets contain outliers or extreme observations. In particular, the performance of these methods deteriorates further when the data come from a small number of samples in the presence of outliers. Therefore, in this study, we propose an outlier detection and modification approach for RNA-seq data to overcome these problems of traditional methods. We make the proposed method applicable to RNA-seq data by transforming the read count data into continuous data. Methods: We use a median control chart to detect and modify the outlying observations in a log-transformed RNA-seq dataset. To investigate the performance of the proposed method in the absence and presence of outliers, we employ five popular biomarker selection methods (edgeR, edgeR_robust, DESeq, DESeq2 and limma) on both simulated and real datasets. Results: The simulation results strongly suggest that the performance of the proposed method improved in the presence of outliers. The proposed method also detected an additional 18 outlying DE genes from a real mouse RNA-seq dataset that were not detected by traditional methods. Using the KEGG pathway and gene ontology analysis results, we reveal that these genes may be biomarkers, which require validation in a wet lab. Conclusions: We propose applying the method to biomarker identification in other RNA-seq data.
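The abstract does not give the chart's exact control limits or replacement rule, so the following is only a minimal sketch of the general idea: log-transform one gene's read counts, flag values outside median-based limits, and replace them. The function name `median_chart_clean`, the `log2(count + 1)` transform, the MAD-based limits, the multiplier `k = 3` and the replace-with-median rule are all assumptions for illustration, not the authors' procedure.

```python
import math
from statistics import median


def median_chart_clean(counts, k=3.0):
    """Flag and replace outlying values in one gene's read counts.

    Hypothetical helper: the limits and replacement rule are assumed,
    not taken from the paper.
    """
    # Transform discrete counts to a continuous scale (assumed transform).
    logged = [math.log2(c + 1) for c in counts]

    # Center line of the chart: the sample median.
    center = median(logged)

    # Robust spread: median absolute deviation, scaled by 1.4826 so that it
    # estimates the standard deviation under normality.
    mad = 1.4826 * median([abs(x - center) for x in logged])

    # Control limits around the median (assumed form).
    lower, upper = center - k * mad, center + k * mad

    # Flag points outside the limits and replace them with the median.
    flags = [not (lower <= x <= upper) for x in logged]
    cleaned = [center if flagged else x for x, flagged in zip(logged, flags)]
    return cleaned, flags


counts = [10, 12, 11, 9, 5000, 10]
cleaned, flags = median_chart_clean(counts)
print(flags)
```

Note that if more than half of the values are identical, the MAD is zero and every value different from the median is flagged; a real implementation would need a fallback for that case.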


Bayesian residual analysis for spatially correlated data

Statistical Modelling ◽

10.1177/1471082x18811529 ◽

2019 ◽

Vol 20 (2) ◽

pp. 171-194 ◽

Cited By ~ 1

Author(s):

Viviana GR Lobo ◽

Thaís CO Fonseca

Keyword(s):

Spatial Data ◽

Density Ratio ◽

Spatial Models ◽

Residual Analysis ◽

Correlated Data ◽

Point Of View ◽

Spatially Correlated ◽

Wind Speeds ◽

Standardized Residuals ◽

Outlying Observation

This work considers residual analysis and predictive techniques for the identification of individual and multiple outliers in geostatistical data. The standardized Bayesian spatial residual is proposed and computed for three competing models: the Gaussian, Student-t and Gaussian-log-Gaussian spatial processes. In this context, the spatial models are investigated regarding their plausibility for datasets contaminated with outliers. The posterior probability of an outlying observation is computed based on the standardized residuals and different thresholds for outlier discrimination are tested. From a predictive point of view, methods such as the conditional predictive ordinate, the predictive concordance and the Savage–Dickey density ratio for hypothesis testing are investigated for identification of outliers in the spatial setting. For illustration, contaminated datasets are considered to assess the performance of the three spatial models for identification of outliers in spatial data. Furthermore, an application to wind speed modelling is presented to illustrate the usefulness of the proposed tools to detect regions with large wind speeds.


Outlying Observation Diagnostics in Growth Curve Modeling

Multivariate Behavioral Research ◽

10.1080/00273171.2017.1374824 ◽

2017 ◽

Vol 52 (6) ◽

pp. 768-788 ◽

Cited By ~ 3

Author(s):

Xin Tong ◽

Zhiyong Zhang

Keyword(s):

Growth Curve ◽

Growth Curve Modeling ◽

Curve Modeling ◽

Outlying Observation


Robust AIC with High Breakdown Scale Estimate

Journal of Applied Mathematics ◽

10.1155/2014/286414 ◽

2014 ◽

Vol 2014 ◽

pp. 1-7 ◽

Cited By ~ 3

Author(s):

Shokrya Saleh

Keyword(s):

Influence Function ◽

Real Data ◽

Information Criterion ◽

Breakdown Point ◽

Point Estimate ◽

Leverage Points ◽

High Breakdown Point ◽

Outlying Observation ◽

Scale Estimate ◽

Squared Residuals

The Akaike information criterion (AIC) based on least squares (LS) regression minimizes the sum of the squared residuals; LS is sensitive to outlying observations. Alternative criteria that are less sensitive to outlying observations have been proposed; examples are robust AIC (RAIC), robust Mallows Cp (RCp), and robust Bayesian information criterion (RBIC). In this paper, we propose a robust AIC by replacing the scale estimate with a high breakdown point estimate of scale. The robustness of the proposed method is studied through its influence function. Using simulated and real data examples, we show that the proposed robust AIC is effective in selecting accurate models in the presence of outliers and high leverage points.
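The abstract does not state the paper's criterion in closed form, so the sketch below only illustrates the substitution it describes: the same AIC-style expression evaluated once with the classical LS scale and once with a high breakdown scale estimate. The form `n * log(sigma^2) + 2p`, the choice of the scaled MAD as the robust scale, and the function names `aic_ls` and `aic_mad` are assumptions made for illustration.

```python
import math
from statistics import median


def aic_ls(residuals, n_params):
    """AIC-style score using the classical scale (mean squared residual)."""
    n = len(residuals)
    sigma2 = sum(r * r for r in residuals) / n
    return n * math.log(sigma2) + 2 * n_params


def aic_mad(residuals, n_params):
    """Same score with the scale replaced by the scaled MAD (assumed choice).

    The MAD has a 50% breakdown point, so a single extreme residual
    cannot inflate the scale estimate.
    """
    n = len(residuals)
    center = median(residuals)
    sigma = 1.4826 * median([abs(r - center) for r in residuals])
    return n * math.log(sigma ** 2) + 2 * n_params


# Hypothetical residuals from a two-parameter fit, with one gross outlier.
residuals = [0.1, -0.2, 0.15, -0.1, 0.05, 8.0]
print(aic_ls(residuals, 2), aic_mad(residuals, 2))
```

With these residuals the single outlier dominates the LS scale, so the LS-based score comes out much larger than the MAD-based one. Both functions fail with a math domain error if the scale estimate is exactly zero, which a real implementation would have to guard against.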
