A RANGE CRITERION FOR TESTING AN OUTLYING OBSERVATION

1951 ◽  
Author(s):  
J. Moshman ◽  
G. J. Atta
Keyword(s):  
F1000Research ◽  
2019 ◽  
Vol 8 ◽  
pp. 7
Author(s):  
Md Shahjaman ◽  
Habiba Akter ◽  
Md. Mamunur Rashid ◽  
Md. Ibnul Asifuzzaman ◽  
Md. Bipul Hossen ◽  
...  

Background: One of the main goals of RNA-seq data analysis is the identification of biomarkers that are differentially expressed (DE) across two or more experimental conditions. RNA-seq uses next-generation sequencing technology and has many advantages over microarrays. Numerous statistical methods have already been developed for identifying biomarkers from RNA-seq data. Most of these methods are based on either the Poisson distribution or the negative binomial distribution. However, existing methods identify biomarkers from discrete RNA-seq data less efficiently when the datasets contain outliers or extreme observations. In particular, their performance degrades further when outliers are present and the data come from a small number of samples. Therefore, in this study, we propose an outlier detection and modification approach for RNA-seq data to overcome these problems of traditional methods. We make the proposed method applicable to RNA-seq data by transforming the read count data into continuous data. Methods: We use a median control chart to detect and modify outlying observations in a log-transformed RNA-seq dataset. To investigate the performance of the proposed method in the absence and presence of outliers, we employ five popular biomarker selection methods (edgeR, edgeR_robust, DESeq, DESeq2 and limma) on both simulated and real datasets. Results: The simulation results strongly suggest that the proposed method improves performance in the presence of outliers. The proposed method also detected an additional 18 outlying DE genes from a real mouse RNA-seq dataset that were not detected by traditional methods. Using the results of KEGG pathway and gene ontology analysis, we show that these genes may be biomarkers, which require validation in a wet lab. Conclusions: We propose applying the proposed method to biomarker identification from other RNA-seq data.
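
The detect-and-modify step described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The abstract does not state the chart limits, the pseudo-count, or the replacement rule, so the choices here are assumptions: a log2(count + 1) transform, limits at the median plus or minus k scaled median absolute deviations (MAD) with k = 3, and replacement of flagged values by the median. The function names and the example counts are hypothetical.

```python
import math
from statistics import median


def log_transform(counts, pseudo=1.0):
    """Map read counts to a continuous scale (assumed: log2 with a pseudo-count)."""
    return [math.log2(c + pseudo) for c in counts]


def median_chart_limits(values, k=3.0):
    """Return (lower, center, upper) limits of an assumed median-based chart.

    The center line is the median; the spread is the MAD scaled by 1.4826,
    which makes it comparable to a standard deviation under normality.
    """
    center = median(values)
    mad = median(abs(v - center) for v in values)
    sigma = 1.4826 * mad
    return center - k * sigma, center, center + k * sigma


def detect_and_modify(counts, k=3.0):
    """Flag values outside the limits and replace them with the median (assumed rule)."""
    logged = log_transform(counts)
    lower, center, upper = median_chart_limits(logged, k)
    flags = [not (lower <= v <= upper) for v in logged]
    cleaned = [center if flagged else v for v, flagged in zip(logged, flags)]
    return cleaned, flags


# Hypothetical read counts for one gene within one condition; 5000 is the outlier.
group = [52, 48, 50, 47, 5000, 51]
cleaned, flags = detect_and_modify(group)
print(flags)
```

If more than half of the values in a group are identical, the MAD is zero and every other value is flagged, so a real implementation would need a fallback for that case.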


2019 ◽  
Vol 20 (2) ◽  
pp. 171-194 ◽  
Author(s):  
Viviana GR Lobo ◽  
Thaís CO Fonseca

This work considers residual analysis and predictive techniques for the identification of individual and multiple outliers in geostatistical data. The standardized Bayesian spatial residual is proposed and computed for three competing models: the Gaussian, Student-t and Gaussian-log-Gaussian spatial processes. In this context, the spatial models are investigated regarding their plausibility for datasets contaminated with outliers. The posterior probability of an outlying observation is computed based on the standardized residuals and different thresholds for outlier discrimination are tested. From a predictive point of view, methods such as the conditional predictive ordinate, the predictive concordance and the Savage–Dickey density ratio for hypothesis testing are investigated for identification of outliers in the spatial setting. For illustration, contaminated datasets are considered to assess the performance of the three spatial models for identification of outliers in spatial data. Furthermore, an application to wind speed modelling is presented to illustrate the usefulness of the proposed tools to detect regions with large wind speeds.
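
One of the predictive tools named above, the conditional predictive ordinate (CPO), is commonly estimated from posterior draws as the harmonic mean of the likelihood of each observation. The sketch below illustrates only that estimator. It assumes an independent Gaussian likelihood rather than any of the spatial processes in the abstract, and the posterior draws and observations are made-up illustrative values, not results from the paper.

```python
import math


def normal_pdf(y, mu, sigma):
    """Density of N(mu, sigma^2) at y."""
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def cpo(y_i, draws):
    """Harmonic-mean estimate of the CPO for one observation.

    draws is a list of (mu, sigma) posterior samples; a small CPO means the
    observation is poorly predicted by the rest of the data.
    """
    inverse_likelihoods = [1.0 / normal_pdf(y_i, mu, sigma) for mu, sigma in draws]
    return len(inverse_likelihoods) / sum(inverse_likelihoods)


# Illustrative (made-up) posterior draws and observations.
draws = [(0.0, 1.0), (0.1, 0.9), (-0.1, 1.1)]
observations = [0.2, -0.3, 5.0]
cpo_values = [cpo(y, draws) for y in observations]
print(cpo_values)
```

In this toy setting the third observation receives a much smaller CPO than the other two, which is the pattern an outlier screen based on CPO would look for.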


2014 ◽  
Vol 2014 ◽  
pp. 1-7 ◽  
Author(s):  
Shokrya Saleh

The Akaike Information Criterion (AIC) based on least squares (LS) regression minimizes the sum of the squared residuals; LS is sensitive to outlying observations. Alternative criteria that are less sensitive to outlying observations have been proposed; examples are the robust AIC (RAIC), robust Mallows Cp (RCp), and robust Bayesian information criterion (RBIC). In this paper, we propose a robust AIC by replacing the scale estimate with a high breakdown point estimate of scale. The robustness of the proposed method is studied through its influence function. We show, through simulated and real data examples, that the proposed robust AIC is effective in selecting accurate models in the presence of outliers and high leverage points.
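
The idea of swapping the scale estimate inside AIC can be illustrated with a small sketch. The abstract does not give the exact form of the proposed criterion, so the code below is an assumption: it takes the Gaussian LS form n log(sigma^2) + 2p and replaces the LS variance estimate RSS/n with the squared normalized MAD of the residuals, one well-known scale estimate with a 50% breakdown point. The function names and residual values are hypothetical.

```python
import math
from statistics import median


def mad_scale(residuals):
    """Normalized median absolute deviation, a high breakdown point scale estimate."""
    center = median(residuals)
    return 1.4826 * median(abs(r - center) for r in residuals)


def classical_aic(residuals, n_params):
    """Gaussian LS form of AIC, up to an additive constant: n log(RSS / n) + 2p."""
    n = len(residuals)
    rss = sum(r * r for r in residuals)
    return n * math.log(rss / n) + 2 * n_params


def robust_aic_sketch(residuals, n_params):
    """Assumed robust variant: the LS variance estimate is replaced by MAD squared."""
    n = len(residuals)
    return n * math.log(mad_scale(residuals) ** 2) + 2 * n_params


# Hypothetical residuals from a two-parameter fit, without and with one outlier.
clean = [0.1, -0.2, 0.15, -0.05, 0.0, 0.2, -0.1]
contaminated = [0.1, -0.2, 0.15, -0.05, 0.0, 0.2, 10.0]

classical_shift = abs(classical_aic(contaminated, 2) - classical_aic(clean, 2))
robust_shift = abs(robust_aic_sketch(contaminated, 2) - robust_aic_sketch(clean, 2))
print(classical_shift, robust_shift)
```

A single gross residual moves the classical value substantially, while the MAD-based value barely changes. In practice the residuals themselves would also come from a robust fit rather than from LS.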

