Influence diagnostics for multivariate growth curve models

YUN LING; STEWART J. ANDERSON; RICHARD A. BILONICK; GADI WOLLSTEIN; JOEL S. SCHUMAN

doi:10.47302/jsr.2017510101

Journal of Statistical Research ◽

10.47302/jsr.2017510101 ◽

2017 ◽

Vol 51 (1) ◽

pp. 1-16

Author(s):

YUN LING ◽

STEWART J. ANDERSON ◽

RICHARD A. BILONICK ◽

GADI WOLLSTEIN ◽

JOEL S. SCHUMAN

Keyword(s):

General Situation ◽

Influential Observations ◽

Longitudinal Models ◽

Mixed Effect ◽

Growth Curve Models ◽

Rigorous Approach ◽

Influential Observation ◽

Influence Diagnostics ◽

Cook's Distance

Research has shown that in mixed-effects longitudinal models, influential observations can have a large effect on the estimates of subject-specific parameters. Furthermore, they cannot always be detected by the classical Cook’s distance because of potentially large between-subject variation. Thus, influential observations should be assessed by conditioning on the subjects. However, no rigorous approach has been developed for detecting influential observations in multivariate longitudinal mixed models, where more than one response is measured for each subject at each time point. We propose a multivariate conditional Cook’s distance for this more general situation. Examples are given to illustrate how an influential observation in one characteristic changes the effects of both characteristics.
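
For orientation, the classical Cook's distance that the abstract contrasts against can be sketched for an ordinary least-squares fit as below. This is only the textbook single-response statistic, D_i = e_i^2 h_ii / (p s^2 (1 - h_ii)^2), not the multivariate conditional version the paper proposes. The function name `cooks_distance` and the toy data are assumptions made for illustration.

```python
import numpy as np

def cooks_distance(X, y):
    """Classical Cook's distance for every row of an OLS fit of y on X.

    X is an (n, p) design matrix with full column rank (include a column
    of ones for an intercept); y has length n.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    # Leverages h_ii are the squared row norms of Q from a reduced QR of X.
    Q, _ = np.linalg.qr(X)
    h = np.sum(Q**2, axis=1)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    s2 = resid @ resid / (n - p)  # residual variance estimate
    return resid**2 * h / (p * s2 * (1.0 - h) ** 2)

# Toy data, made up for illustration: a straight line with one shifted point.
x = np.arange(10.0)
X = np.column_stack([np.ones_like(x), x])
y = 2.0 + 0.5 * x
y[9] += 5.0
d = cooks_distance(X, y)
print("most influential row:", int(np.argmax(d)))
```

In a mixed model, large between-subject variation can hide an observation that is unusual only relative to its own subject, which is the motivation the abstract gives for conditioning on subjects.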

Detecting influential observations by using a new expression of Cook's distance

Communications in Statistics - Theory and Methods ◽

10.1080/03610929108830495 ◽

1991 ◽

Vol 20 (1) ◽

pp. 261-274 ◽

Cited By ~ 6

Author(s):

Takeuchi Hidekazu

Keyword(s):

Influential Observations ◽

Cook's Distance

Detection of Influential Observations in Spatial Regression Model Based on Outliers and Bad Leverage Classification

Symmetry ◽

10.3390/sym13112030 ◽

2021 ◽

Vol 13 (11) ◽

pp. 2030

Author(s):

Ali Mohammed Baba ◽

Habshah Midi ◽

Mohd Bakri Adam ◽

Nur Haizum Abd Rahman

Keyword(s):

Regression Model ◽

Regression Models ◽

Spatial Regression ◽

Spatial Models ◽

Influential Observations ◽

Leverage Points ◽

Spatial Regression Models ◽

Cook's Distance ◽

Classical Regression

Influential observations (IOs), which are outliers in the x direction, the y direction, or both, remain a problem in classical regression model fitting. Spatial regression models have a peculiar kind of outlier because their outliers are local in nature, and these models are likewise not free from the effect of influential observations. Researchers have adapted some classical regression techniques to spatial models and obtained satisfactory results. However, masking and/or swamping remains a stumbling block for such methods. In this article, we obtain a measure of spatial Studentized prediction residuals that incorporates spatial information on the dependent variable and the residuals. We propose a robust spatial diagnostic plot to classify observations into regular observations, vertical outliers, and good and bad leverage points using a classification based on spatial Studentized prediction residuals and spatial diagnostic potentials, which we refer to as ISRs-Posi and ESRs-Posi. Observations that fall into the vertical outlier and bad leverage point categories are referred to as IOs. Representations of some classical regression diagnostic measures in general spatial models are presented. The commonly used diagnostic measure in spatial diagnostics, Cook’s distance, is compared to some robust methods, Hi2 (using robust and non-robust measures), and our proposed ISRs-Posi and ESRs-Posi plots. Results of our simulation study and applications to real data showed that Cook’s distance, the non-robust Hsi12, and the robust Hsi22 were not very successful in detecting IOs. The Hsi12 suffered from the masking effect, and the robust Hsi22 suffered from swamping in general spatial models. Interestingly, the results showed that the proposed plots were very successful in classifying observations into the correct groups, hence correctly detecting the real IOs.
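
The four-way classification described in the abstract can be sketched generically as follows. This is a minimal illustration that takes already-computed residuals and potentials as inputs. It does not compute the paper's spatial Studentized prediction residuals or spatial diagnostic potentials, and the function name `classify_points` and both cutoff arguments are hypothetical.

```python
import numpy as np

def classify_points(resid, potential, resid_cut, potential_cut):
    """Label each observation from a residual/potential pair.

    resid and potential are precomputed diagnostics (how they are obtained
    is outside this sketch); the two cutoffs are user-supplied assumptions.
    """
    resid = np.asarray(resid, dtype=float)
    potential = np.asarray(potential, dtype=float)
    big_resid = np.abs(resid) > resid_cut
    big_pot = potential > potential_cut
    labels = np.full(resid.shape, "regular", dtype=object)
    labels[big_resid & ~big_pot] = "vertical outlier"
    labels[~big_resid & big_pot] = "good leverage"
    labels[big_resid & big_pot] = "bad leverage"
    return labels

def influential(labels):
    """IOs in the abstract's sense: vertical outliers and bad leverage points."""
    labels = np.asarray(labels, dtype=object)
    return (labels == "vertical outlier") | (labels == "bad leverage")

# Made-up diagnostics for four observations.
labels = classify_points([0.1, 3.0, 0.2, -4.0], [0.1, 0.1, 0.9, 0.9],
                         resid_cut=2.5, potential_cut=0.5)
print(list(labels), influential(labels))
```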

Cook's distance in linear longitudinal models

Communications in Statistics - Theory and Methods ◽

10.1080/03610929808832267 ◽

1998 ◽

Vol 27 (12) ◽

pp. 2973-2983 ◽

Cited By ~ 3

Author(s):

Mousumi Banerjee

Keyword(s):

Longitudinal Models ◽

Cook's Distance

Detection of Influential Observations in Spatial Regression Model Based on Outliers and Bad Leverage Classification

10.20944/preprints202108.0178.v1 ◽

2021 ◽

Author(s):

Ali Mohammed Baba ◽

Habshah Midi ◽

Mohd Bakri Adam ◽

Nur Haizum Bint Abd Rahman

Keyword(s):

Regression Model ◽

Spatial Model ◽

Spatial Regression ◽

Model Fitting ◽

Spatial Prediction ◽

Influential Observations ◽

Spatial Regression Model ◽

Cook's Distance ◽

Classical Regression

Influential observations, which are outliers in the x direction, the y direction, or both, remain a hitch in classical regression model fitting. The spatial regression model, whose outliers are peculiar because of their local nature, is not free from the effect of such influential observations. Researchers have adapted some classical regression techniques to spatial models and obtained satisfactory results. However, masking and/or swamping remains a stumbling block for such methods. We obtain the spatial representation of the classical regression diagnostic measures in the general spatial model. The commonly used diagnostic measure in spatial diagnostics, Cook's distance, is compared to some robust methods, Hi2 (using robust and non-robust measures), and to a classification based on generalized residuals and diagnostic generalized potentials, ISRs-Posi and ESRs-Posi, with the help of the obtained spatial prediction residuals and the spatial leverage term. Results of simulation and applications to real data show the advantage of ISRs-Posi and ESRs-Posi, owing to their classification of outliers, over Cook's distance and the non-robust Hsi12, which suffer from masking, and the robust Hsi22, which suffers from swamping in the general spatial model.

Reference values for Cook's distance

Communications in Statistics - Simulation and Computation ◽

10.1080/03610919608813337 ◽

1996 ◽

Vol 25 (3) ◽

pp. 691-708 ◽

Cited By ~ 23

Author(s):

Choongrak Kim ◽

Barry E. Storer

Keyword(s):

Reference Values ◽

Cook's Distance

Cook's Distance

Dictionary of Statistics & Methodology ◽

10.4135/9781412983907.n407 ◽

2015 ◽

Keyword(s):

Cook's Distance

Cook's Distance

10.1007/springerreference_205273 ◽

2012 ◽

Keyword(s):

Cook's Distance

Influential Observations and Inference in Accounting Research

The Accounting Review ◽

10.2308/accr-52396 ◽

2016 ◽

Vol 94 (6) ◽

pp. 337-364 ◽

Cited By ~ 43

Author(s):

Andrew J. Leone ◽

Miguel Minutti-Meza ◽

Charles E. Wasley

Keyword(s):

Robust Regression ◽

Extreme Values ◽

Data Availability ◽

Influence Coefficient ◽

Influential Observations ◽

Coefficient Estimates ◽

The Public ◽

Influence Diagnostics ◽

Sensitivity Tests ◽

Data Points

ABSTRACT Accounting studies often encounter observations with extreme values that can influence coefficient estimates and inferences. Two widely used approaches to address influential observations in accounting studies are winsorization and truncation. While expedient, both depend on researcher-selected cutoffs, applied on a variable-by-variable basis, which, unfortunately, can alter legitimate data points. We compare the efficacy of winsorization, truncation, influence diagnostics (Cook's Distance), and robust regression at identifying influential observations. Replication of three published accounting studies shows that the choice impacts estimates and inferences. Simulation evidence shows that winsorization and truncation are ineffective at identifying influential observations. While influence diagnostics and robust regression both outperform winsorization and truncation, overall, robust regression outperforms the other methods. Since robust regression is a theoretically appealing and easily implementable approach based on a model's residuals, we recommend that future accounting studies consider using robust regression, or at least report sensitivity tests using robust regression. JEL Classifications: C12; C13; C18; C51; C52; M41. Data Availability: Data are available from the public sources cited in the text.
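
The difference between the two cutoff-based approaches named in the abstract can be sketched as follows: winsorization replaces extreme values with the cutoff values, while truncation drops them. The function names and the percentile defaults are assumptions chosen for illustration, not the settings used in the study.

```python
import numpy as np

def winsorize(x, lower=1.0, upper=99.0):
    """Pull values outside the [lower, upper] percentiles in to the cutoffs."""
    x = np.asarray(x, dtype=float)
    lo, hi = np.percentile(x, [lower, upper])
    return np.clip(x, lo, hi)

def truncate(x, lower=1.0, upper=99.0):
    """Drop values outside the [lower, upper] percentiles."""
    x = np.asarray(x, dtype=float)
    lo, hi = np.percentile(x, [lower, upper])
    return x[(x >= lo) & (x <= hi)]

# Made-up sample with one extreme value.
sample = np.array([1.0, 2.0, 3.0, 4.0, 100.0])
print(winsorize(sample, 20, 80))  # same length, extremes pulled in
print(truncate(sample, 20, 80))   # shorter, extremes removed
```

Both functions work on one variable at a time and depend on the chosen percentiles, which is the limitation the abstract points to. Neither looks at how an observation affects a fitted model.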

Deletion statistic accuracy in confirmatory factor models

Methodological Innovations ◽

10.1177/2059799120918349 ◽

2020 ◽

Vol 13 (2) ◽

pp. 205979912091834

Author(s):

Jennifer Koran ◽

Fathima Jaffari

Keyword(s):

Factor Model ◽

Model Misspecification ◽

Statistical Simulation ◽

Factor Models ◽

Case Deletion ◽

Difference Statistic ◽

Confirmatory Factor ◽

Cook's Distance ◽

Influential Cases

Social science researchers now routinely use confirmatory factor models in scale development and validation studies. Methodologists have known for some time that the results of fitting a confirmatory factor model can be unduly influenced by one or a few cases in the data. However, there has been little development and use of case diagnostics for identifying influential cases with confirmatory factor models. A few case deletion statistics have been proposed to identify influential cases in confirmatory factor models. However, these statistics have not been systematically evaluated or compared for their accuracy. This study evaluated the accuracy of three case deletion statistics found in the R package influence.SEM. The accuracy of the case deletion statistics was also compared to Mahalanobis distance, which is commonly used to screen for unusual cases in multivariate applications. A statistical simulation was used to compare the accuracy of the statistics in identifying target cases generated from a model in which variables were uncorrelated. The results showed that Likelihood distance and generalized Cook’s distance detected the target cases more effectively than the Chi-square difference statistic. The accuracy of the Likelihood distance and generalized Cook’s distance statistics was unaffected by model misspecification. The results of this study suggest that Likelihood distance and generalized Cook’s distance are more accurate under more varied conditions in identifying target cases in confirmatory factor models.
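
The Mahalanobis distance screen used as the comparison baseline in the abstract can be sketched as below. This is a generic NumPy illustration with made-up data, not the procedure from the study or from the influence.SEM package. The function name `mahalanobis_sq` is an assumption.

```python
import numpy as np

def mahalanobis_sq(data):
    """Squared Mahalanobis distance of each row from the sample mean.

    Uses the sample covariance (n - 1 denominator); data is (n, p) with
    n > p so that the covariance matrix is invertible.
    """
    data = np.asarray(data, dtype=float)
    centered = data - data.mean(axis=0)
    cov = np.cov(data, rowvar=False)
    # Solve cov @ z = centered.T rather than forming the inverse explicitly.
    solved = np.linalg.solve(cov, centered.T).T
    return np.einsum("ij,ij->i", centered, solved)

# Made-up data: 50 cases on 3 variables, with one planted unusual case.
rng = np.random.default_rng(0)
cases = rng.normal(size=(50, 3))
cases[7] = [8.0, -8.0, 8.0]
d2 = mahalanobis_sq(cases)
print("most unusual case:", int(np.argmax(d2)))
```

Mahalanobis distance measures how unusual a case is in the space of the observed variables. Case deletion statistics such as generalized Cook's distance instead measure how much the fitted model changes when the case is removed, so the two can flag different cases.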

A CONDITIONAL COOK'S DISTANCE TO ASSESS INFLUENCE IN AUTOREGRESSIVE MODELS

Communications in Statistics - Theory and Methods ◽

10.1081/sta-100104750 ◽

2001 ◽

Vol 30 (7) ◽

pp. 1373-1380

Author(s):

Hua Lin ◽

Andy H. Lee

Keyword(s):

Autoregressive Models ◽

Cook's Distance
