Confidence Interval Estimation of an ROC Curve: An Application of Generalized Half Normal and Weibull Distributions

2015
Vol 2015
pp. 1-8
Author(s):
S. Balaswamy
R. Vishnu Vardhan

In recent years, work in ROC analysis has gained attention for explaining the accuracy of a test and identifying its optimal threshold. Models that assume a distribution for each population are referred to as bidistributional ROC models, for example the Binormal, Bi-Exponential, and Bi-Logistic models. In practice, however, data are often skewed with extended tails, and the accuracy of a test must then be described through both scale and shape parameters. The present paper therefore proposes an ROC model based on two generalized distributions, the generalized half normal and the Weibull, to describe test accuracy. Further, confidence intervals are constructed for the coordinates of the curve (FPR, TPR) and for the accuracy measure, the Area Under the Curve (AUC); these explain the variability of the curve and provide the sensitivity at a particular value of specificity and vice versa. The proposed methodology is supported by a real data set and simulation studies.
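The bidistributional construction can be sketched directly: with healthy scores following F0 and diseased scores following F1, the curve is ROC(t) = 1 − F1(F0⁻¹(1 − t)). Below is a minimal Python sketch using scipy's halfgennorm as a stand-in for the generalized half normal (the paper's parameterization may differ) and a percentile bootstrap in place of the paper's analytical confidence intervals; all parameter values are hypothetical.

```python
# Sketch of a bidistributional ROC curve and a bootstrap CI for its AUC.
# halfgennorm stands in for the generalized half normal; parameters are
# hypothetical, and the paper's CI construction is analytical, not bootstrap.
import numpy as np
from scipy.stats import halfgennorm, weibull_min

F0 = halfgennorm(beta=1.5, scale=1.0)   # healthy scores (assumed model)
F1 = weibull_min(c=2.0, scale=2.5)      # diseased scores (assumed model)

t = np.linspace(0.0, 1.0, 501)          # grid of false positive rates
roc = F1.sf(F0.isf(t))                  # ROC(t) = 1 - F1(F0^{-1}(1 - t))
auc = np.sum((roc[1:] + roc[:-1]) * np.diff(t)) / 2   # trapezoidal AUC

rng = np.random.default_rng(0)
x0, x1 = F0.rvs(200, random_state=rng), F1.rvs(200, random_state=rng)
boots = []
for _ in range(500):                    # percentile bootstrap for the AUC
    b0, b1 = rng.choice(x0, x0.size), rng.choice(x1, x1.size)
    boots.append(np.mean(b1[:, None] > b0[None, :]))   # P(X1 > X0)
lo, hi = np.percentile(boots, [2.5, 97.5])
print(f"AUC = {auc:.3f}, bootstrap 95% CI = ({lo:.3f}, {hi:.3f})")
```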

2021
Vol 10 (5)
pp. 38
Author(s):
Wei Chen
Fengling Ren

In this paper, we propose a bootstrap approach for constructing confidence intervals of quantiles under current status data; the method is computationally simple and efficient and requires no estimation of nuisance parameters. Its soundness is supported by good performance in an extensive simulation study, and a real data set is analyzed as an illustration.
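As a rough illustration of the setting (not the authors' exact construction), one can estimate the distribution function under current status data by isotonic regression of the status indicators on the monitoring times, read off a quantile, and bootstrap the pairs; the data below are simulated and all names are assumptions.

```python
# Naive sketch: NPMLE of F via isotonic regression, quantile read-off, and a
# simple pairs bootstrap. The paper's interval may be constructed differently;
# this only illustrates the current-status setup.
import numpy as np
from sklearn.isotonic import IsotonicRegression

def npmle_quantile(c, d, p):
    order = np.argsort(c)
    c, d = c[order], d[order]
    F = IsotonicRegression(y_min=0.0, y_max=1.0).fit_transform(c, d)
    idx = np.searchsorted(F, p)              # first time with F(c) >= p
    return c[min(idx, len(c) - 1)]

rng = np.random.default_rng(1)
n = 300
T = rng.exponential(2.0, n)                  # latent event times (never observed)
C = rng.uniform(0.0, 6.0, n)                 # monitoring times
D = (T <= C).astype(float)                   # current status indicators

est = npmle_quantile(C, D, 0.5)
boot = []
for _ in range(500):
    idx = rng.integers(0, n, n)              # resample (C_i, D_i) pairs
    boot.append(npmle_quantile(C[idx], D[idx], 0.5))
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"median: {est:.2f}, 95% bootstrap CI: ({lo:.2f}, {hi:.2f})")
```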


Stats
2019
Vol 2 (1)
pp. 111-120
Author(s):
Dewi Rahardja

We construct point and interval estimates, using a Bayesian approach, for the difference of two population proportions based on two independent samples of binomial data subject to one type of misclassification. Specifically, we derive an easy-to-implement, closed-form algorithm for drawing from the posterior distributions. For illustration, we apply the algorithm to a real data example. Finally, we conduct simulation studies to demonstrate the efficiency of our algorithm for Bayesian inference.
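For intuition, here is a minimal sketch of the conjugate no-misclassification analogue: with independent Beta priors, posterior draws for p1 − p2 are immediate. The misclassification layer of the paper, which is what requires the derived algorithm, is omitted, and the data are hypothetical.

```python
# Conjugate sketch for the difference of two binomial proportions, omitting
# the paper's misclassification adjustment; with Beta priors the posteriors
# are Beta, so drawing from them is a one-liner.
import numpy as np

rng = np.random.default_rng(2)
x1, n1 = 45, 100          # successes / trials, sample 1 (hypothetical)
x2, n2 = 30, 100          # sample 2 (hypothetical)
a, b = 1.0, 1.0           # Beta(1, 1) priors

p1 = rng.beta(a + x1, b + n1 - x1, 100_000)   # posterior draws for p1
p2 = rng.beta(a + x2, b + n2 - x2, 100_000)   # posterior draws for p2
diff = p1 - p2

lo, hi = np.percentile(diff, [2.5, 97.5])
print(f"posterior mean: {diff.mean():.3f}, 95% credible interval: ({lo:.3f}, {hi:.3f})")
```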


2020
Vol 44 (5)
pp. 362-375
Author(s):
Tyler Strachan
Edward Ip
Yanyan Fu
Terry Ackerman
Shyh-Huei Chen
...

As a method to derive a “purified” measure along a dimension of interest from response data that are potentially multidimensional in nature, the projective item response theory (PIRT) approach requires first fitting a multidimensional item response theory (MIRT) model to the data before projecting onto the dimension of interest. This study explores how accurate PIRT results are when the estimated MIRT model is misspecified. Specifically, we focus on using a (potentially misspecified) two-dimensional (2D) MIRT for projection because of its advantages over higher dimensional models, including interpretability, identifiability, and computational stability. Two large simulation studies (I and II) were conducted. Both examined whether fitting a 2D-MIRT is sufficient to recover the PIRT parameters when multiple nuisance dimensions exist in the test items, which were generated under compensatory MIRT and bifactor models, respectively. Various factors were manipulated, including sample size, test length, latent factor correlation, and the number of nuisance dimensions. The results of simulation studies I and II showed that PIRT was overall robust to a misspecified 2D-MIRT. Two smaller follow-up simulation studies evaluated recovery of the PIRT model parameters when the correctly specified higher dimensional MIRT or bifactor model was fitted to the response data. In addition, a real data set was used to illustrate the robustness of the PIRT approach.
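The projection step itself is easy to sketch: for a compensatory 2D 2PL item, the projected item characteristic curve is the expectation of the response surface over the conditional distribution of the nuisance trait given the primary one. The following is an illustrative numerical version with hypothetical parameters, not the authors' implementation.

```python
# Projection of a 2D compensatory 2PL item onto the primary trait:
# E[P(theta1, theta2) | theta1], with theta2 | theta1 ~ N(rho*theta1, 1-rho^2).
# Simple grid quadrature; all parameter values are hypothetical.
import numpy as np
from scipy.stats import norm
from scipy.special import expit

def projected_icc(theta1, a1, a2, d, rho, n_nodes=61):
    z = np.linspace(-5.0, 5.0, n_nodes)          # standardized grid
    w = norm.pdf(z) / norm.pdf(z).sum()          # normalized quadrature weights
    t2 = rho * theta1 + np.sqrt(1 - rho**2) * z[:, None]   # conditional theta2
    p = expit(a1 * theta1 + a2 * t2 + d)         # 2D 2PL response surface
    return (w[:, None] * p).sum(axis=0)          # projected ICC at theta1

theta = np.linspace(-3.0, 3.0, 7)
print(projected_icc(theta, a1=1.2, a2=0.6, d=-0.3, rho=0.4))
```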


2021
Vol 2
pp. 1
Author(s):
Haitham M. Yousof
Mustafa C. Korkmaz
G.G. Hamedani
Mohamed Ibrahim

In this work, we derive a novel extension of the Chen distribution. Some statistical properties of the new model are derived, and numerical analysis of its mean, variance, skewness, and kurtosis is presented, along with some characterizations of the proposed distribution. Different classical estimation methods under uncensored schemes, namely maximum likelihood, Anderson–Darling, weighted least squares, and right-tail Anderson–Darling, are considered, and simulation studies are performed to compare and assess these estimation methods. To compare the applicability of the four classical methods, two applications to real data sets are analyzed.
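Since the new model extends the Chen distribution, a maximum likelihood sketch for the baseline Chen CDF, F(x) = 1 − exp{λ(1 − e^(x^β))}, shows the shape of the fitting pipeline; the extension's extra parameters would enter the same way. The data below are simulated, and the parameterization is the common one from Chen (2000).

```python
# MLE sketch for the baseline Chen distribution,
# F(x) = 1 - exp(lam * (1 - exp(x**beta))), on simulated data.
import numpy as np
from scipy.optimize import minimize

def neg_loglik(params, x):
    lam, beta = np.exp(params)                 # log-parameterization keeps both positive
    xb = x**beta
    ll = (np.log(lam) + np.log(beta) + (beta - 1) * np.log(x)
          + xb + lam * (1 - np.exp(xb)))
    return -ll.sum()

rng = np.random.default_rng(3)
u = rng.uniform(size=500)
lam0, beta0 = 0.5, 0.8                         # true values for the simulation
x = (np.log(1 - np.log(1 - u) / lam0))**(1 / beta0)   # inverse-CDF sampling

res = minimize(neg_loglik, x0=np.log([1.0, 1.0]), args=(x,), method="Nelder-Mead")
print("MLE (lambda, beta):", np.exp(res.x))
```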


2018
Vol 33 (1)
pp. 31-43
Author(s):
Bol A. M. Atem
Suleman Nasiru
Kwara Nantomah

This article studies the properties of the Topp–Leone linear exponential distribution. The parameters of the new model are estimated using maximum likelihood estimation, and simulation studies are performed to examine the finite-sample properties of the estimates. An application of the model is demonstrated using a real data set. Finally, a bivariate extension of the model is proposed.
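Under one common parameterization of the Topp–Leone linear exponential distribution, F(x) = [1 − e^(−2(θx + (β/2)x²))]^α, the quantile function has a closed form, which makes simulation straightforward; the article's parameterization may differ, and the values below are hypothetical.

```python
# Topp-Leone linear exponential: CDF and inverse-CDF sampling under an assumed
# parameterization F(x) = (1 - exp(-2*(theta*x + 0.5*beta*x**2)))**alpha.
import numpy as np

def tlle_cdf(x, alpha, theta, beta):
    h = theta * x + 0.5 * beta * x**2            # cumulative hazard of the baseline
    return (1.0 - np.exp(-2.0 * h))**alpha

def tlle_rvs(n, alpha, theta, beta, rng):
    u = rng.uniform(size=n)
    c = -0.5 * np.log(1.0 - u**(1.0 / alpha))    # solve F(x) = u for h(x)
    return (-theta + np.sqrt(theta**2 + 2.0 * beta * c)) / beta   # quadratic root

rng = np.random.default_rng(4)
x = tlle_rvs(10_000, alpha=1.5, theta=0.5, beta=0.3, rng=rng)
print("CDF at sample median:", tlle_cdf(np.median(x), 1.5, 0.5, 0.3))  # ~0.5
```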


2018
Vol 7 (2)
pp. 12
Author(s):
Boikanyo Makubate
Broderick O. Oluyede
Gofaone Motobetso
Shujiao Huang
Adeniyi F. Fagbamigbe

A new family of generalized distributions called the beta Weibull-G (BWG) distribution is proposed and developed. This new class contains several new and well-known distributions as special cases, including the exponentiated-G, Weibull-G, Rayleigh-G, exponential-G, beta exponential-G, beta Rayleigh-G, beta Rayleigh exponential, beta exponential-exponential, and Weibull-log-logistic distributions, as well as the beta Weibull-uniform, beta Rayleigh-uniform, beta exponential-uniform, beta Weibull-log-logistic, and beta Weibull-exponential distributions. Series expansions of the density function, the hazard function, moments, mean deviations, Lorenz and Bonferroni curves, Rényi entropy, the distribution of order statistics, and maximum likelihood estimates of the model parameters are given. An application to a real data set is presented to illustrate the importance and usefulness of the beta Weibull-log-logistic special case.
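The construction composes two layers: a Weibull-G CDF, H(x) = 1 − exp{−α[G(x)/(1 − G(x))]^β}, fed through the beta-G transform, F(x) = I_{H(x)}(a, b), the regularized incomplete beta function. A sketch with a uniform baseline (the beta Weibull-uniform special case) follows; the exact convention is an assumption and may differ from the paper's.

```python
# Beta Weibull-G CDF sketch: beta-G transform of the Weibull-G CDF, shown with
# a Uniform(0, 1) baseline G. Conventions and parameter values are assumptions.
import numpy as np
from scipy.stats import beta as beta_dist

def bwg_cdf(x, a, b, alpha, bet, G):
    g = G(x)
    H = 1.0 - np.exp(-alpha * (g / (1.0 - g))**bet)   # Weibull-G CDF
    return beta_dist.cdf(H, a, b)                     # I_H(a, b)

G_uniform = lambda x: np.clip(x, 0.0, 1.0)            # baseline CDF
x = np.linspace(0.01, 0.99, 5)
print(bwg_cdf(x, a=2.0, b=1.5, alpha=1.0, bet=0.8, G=G_uniform))
```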


2019
Vol 13 (4)
pp. 375-385
Author(s):
Saeed Mirzadeh
Anis Iranmanesh

In this study, the researchers introduce a new class of the logistic distribution that can be used to model unimodal data with some skewness present. The new generalization is carried out using the basic idea of Nadarajah (Statistics 48(4):872–895, 2014) and is called the truncated-exponential skew-logistic (TESL) distribution. The TESL distribution is a member of the exponential family; therefore, its skewness parameter is easier to derive. Some important statistical characteristics are presented, and a real data set and simulation studies are used to evaluate the results. The TESL distribution is also compared to at least five other skew-logistic distributions.
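Following the truncated-exponential skew-symmetric construction of Nadarajah (2014), the TESL density should take the form f(x) = λ/(1 − e^(−λ)) g(x) e^(−λG(x)) with logistic g and G; treating that form as an assumption, a short sketch is:

```python
# Truncated-exponential skew-logistic density under the assumed Nadarajah-type
# form f(x) = lam/(1 - exp(-lam)) * g(x) * exp(-lam*G(x)), logistic g and G.
import numpy as np
from scipy.stats import logistic

def tesl_pdf(x, lam, loc=0.0, scale=1.0):
    g = logistic.pdf(x, loc, scale)
    G = logistic.cdf(x, loc, scale)
    return lam / (1.0 - np.exp(-lam)) * g * np.exp(-lam * G)

x = np.linspace(-8.0, 8.0, 2001)
f = tesl_pdf(x, lam=1.5)
area = np.sum((f[1:] + f[:-1]) * np.diff(x)) / 2      # trapezoidal check
print("integral of the density:", area)               # should be close to 1
```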


Psych
2020
Vol 2 (4)
pp. 269-278
Author(s):
Michela Battauz

The four-parameter logistic model is an Item Response Theory model for dichotomous items that limits the probability of a positive response to a restricted range, so that even people at the extremes of the latent trait do not have a probability close to zero or one. Although the literature acknowledges the usefulness of this model in certain contexts, the difficulty of estimating its item parameters has limited its use in practice. In this paper we propose a regularized approach to item parameter estimation based on the inclusion of a penalty term in the log-likelihood function. Simulation studies show the good performance of the proposal, which is further illustrated through an application to a real data set.
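A minimal sketch of the 4PL item response function and a penalized log-likelihood of this general kind follows; the specific penalty (a ridge term shrinking the asymptotes toward the 2PL values c = 0 and d = 1) is an illustrative assumption, not necessarily the paper's choice.

```python
# Four-parameter logistic IRF with a penalized likelihood for one item.
# The penalty shown (shrinking c toward 0 and d toward 1) is an assumption.
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit

def irf_4pl(theta, a, b, c, d):
    # c > 0 and d < 1 keep the response probability away from 0 and 1
    return c + (d - c) * expit(a * (theta - b))

def penalized_negloglik(params, theta, y, lam=5.0):
    a, b, logit_c, logit_d = params
    c, d = expit(logit_c), expit(logit_d)     # map asymptotes to (0, 1)
    p = irf_4pl(theta, a, b, c, d)
    ll = y * np.log(p) + (1 - y) * np.log(1 - p)
    penalty = lam * (c**2 + (1.0 - d)**2)     # shrink toward the 2PL (c=0, d=1)
    return -ll.sum() + penalty

rng = np.random.default_rng(5)
theta = rng.normal(size=2000)
y = (rng.uniform(size=2000) < irf_4pl(theta, 1.3, 0.2, 0.15, 0.95)).astype(float)
res = minimize(penalized_negloglik, x0=[1.0, 0.0, -2.0, 2.0], args=(theta, y))
a, b = res.x[:2]
print(f"a={a:.2f}, b={b:.2f}, c={expit(res.x[2]):.2f}, d={expit(res.x[3]):.2f}")
```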


2015
Vol 26 (6)
pp. 2832-2852
Author(s):
Tuochuan Dong
Kristopher Attwood
Alan Hutson
Song Liu
Lili Tian

Most diagnostic accuracy measures and criteria for selecting optimal cut-points apply only to diseases with two or three stages. For diseases with a general number of stages k, two diagnostic measures currently exist: the hypervolume under the manifold and the generalized Youden index. While the hypervolume under the manifold cannot be used for cut-point selection, the generalized Youden index is defined only in terms of correct classification rates. This paper proposes a new measure for diseases with k stages, named the maximum absolute determinant. This comprehensive new measure utilizes all the available classification information and also serves as a cut-point selection criterion. Both geometric and probabilistic interpretations of the new measure are examined. Power and simulation studies are carried out to investigate its performance as a measure of diagnostic accuracy and as a cut-point selection criterion. A real data set from the Alzheimer's Disease Neuroimaging Initiative is analyzed using the proposed maximum absolute determinant.
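The criterion is easy to state concretely: k − 1 cut-points partition the marker range into k intervals, yielding a k × k matrix whose (i, j) entry is the probability that a stage-i subject is classified into stage j, and the cut-points maximizing the absolute determinant of this matrix are selected. A brute-force sketch for k = 3 with hypothetical normal stage distributions:

```python
# Brute-force illustration of the maximum absolute determinant criterion for
# k = 3 disease stages; stage distributions and the grid are hypothetical.
import numpy as np
from itertools import combinations
from scipy.stats import norm

stages = [norm(0.0, 1.0), norm(1.0, 1.0), norm(2.2, 1.0)]   # assumed stage models

def class_rate_matrix(cuts, stages):
    edges = [-np.inf, *cuts, np.inf]
    # entry (i, j): P(stage-i subject is classified into stage j)
    return np.array([[s.cdf(edges[j + 1]) - s.cdf(edges[j])
                      for j in range(len(stages))] for s in stages])

grid = np.linspace(-2.0, 4.0, 61)
best = max(combinations(grid, 2),
           key=lambda c: abs(np.linalg.det(class_rate_matrix(c, stages))))
print("cut-points maximizing |det|:", np.round(best, 2))
```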


2019
Vol 488 (4)
pp. 5232-5250
Author(s):
Alexander Chaushev
Liam Raynard
Michael R Goad
Philipp Eigmüller
David J Armstrong
...

Vetting of exoplanet candidates in transit surveys is a manual process, which suffers from a large number of false positives and a lack of consistency. Previous work has shown that convolutional neural networks (CNNs) provide an efficient solution to these problems. Here, we apply a CNN to classify planet candidates from the Next Generation Transit Survey (NGTS). For the training data we compare real data with injected planetary transits against fully simulated data, and examine how their different compositions affect network performance. We show that fewer hand-labelled light curves can be utilized while still achieving competitive results. With our best model, we achieve an area under the curve (AUC) score of 95.6 ± 0.2 per cent and an accuracy of 88.5 ± 0.3 per cent on our unseen test data, as well as 76.5 ± 0.4 per cent and 74.6 ± 1.1 per cent in comparison with our existing manual classifications. The network recovers, with high probability, 13 of the 14 confirmed planets observed by NGTS. We use simulated data to show that overall network performance is resilient to mislabelling of the training data set, a problem that might arise from unidentified, low signal-to-noise transits. Using a CNN, the time required for vetting can be reduced by half while still recovering the vast majority of manually flagged candidates. In addition, we identify many new high-probability candidates that were not flagged by human vetters.
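A minimal sketch of a 1D convolutional classifier of the kind used for this task appears below; the architecture, layer sizes, and names are illustrative assumptions, not the network described in the paper.

```python
# Illustrative 1D CNN for classifying light curves as planet candidates vs
# false positives; layer sizes are arbitrary, not the paper's architecture.
import torch
import torch.nn as nn

class TransitCNN(nn.Module):
    def __init__(self, n_points=1000):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=11, padding=5), nn.ReLU(), nn.MaxPool1d(4),
            nn.Conv1d(16, 32, kernel_size=11, padding=5), nn.ReLU(), nn.MaxPool1d(4),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * (n_points // 16), 64), nn.ReLU(),
            nn.Linear(64, 1),               # single logit: planet vs false positive
        )

    def forward(self, x):                   # x: (batch, 1, n_points)
        return self.classifier(self.features(x))

model = TransitCNN()
flux = torch.randn(8, 1, 1000)              # batch of normalized light curves
print(model(flux).shape)                    # torch.Size([8, 1])
```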

