Measurement invariance via multigroup SEM: Issues and solutions with chi-square-difference tests.

2016 · Vol 21 (3) · pp. 405-426
Author(s): Ke-Hai Yuan, Wai Chan

2019 · Vol 47 (10) · pp. 1-9
Author(s): Eun-Young Park, Joungmin Kim

We aimed to verify the factor model and measurement invariance of the abbreviated Center for Epidemiologic Studies Depression Scale by conducting a confirmatory factor analysis. Data came from 761 parents of individuals with intellectual disabilities who completed the scale as part of the 2011 Survey on the Actual Conditions of Individuals with Developmental Disabilities, South Korea, and from 7,301 participants from the general population who completed the scale as part of the 2011 Welfare Panel Study and Survey by the Ministry of Health and Welfare, South Korea. We used fit indices to assess model fit and Amos 22.0 for data analysis. The 4-factor model showed an appropriate fit to the data, and the regression coefficients were significant. However, the chi-square difference test result was nonsignificant; therefore, the metric invariance model was the most appropriate measurement invariance model for the data. Implications of the findings are discussed.
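The chi-square difference test invoked here compares a more constrained invariance model (e.g., metric invariance) against the freer configural model. A minimal sketch in Python, assuming scipy is available; the fit statistics in the usage line are illustrative placeholders, not values from this study:

```python
from scipy.stats import chi2


def chi2_diff_test(chi2_constrained, df_constrained, chi2_free, df_free):
    """p-value of the chi-square (likelihood ratio) difference test for nested models.

    A nonsignificant p supports the additional constraints (e.g., metric invariance).
    """
    d_chi2 = chi2_constrained - chi2_free
    d_df = df_constrained - df_free
    return chi2.sf(d_chi2, d_df)


# Illustrative values only: constrained chi2 = 103.84 (df = 11) vs free chi2 = 100.0 (df = 10)
p = chi2_diff_test(103.84, 11, 100.0, 10)  # p is close to .05 for this example
```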


2020 · Vol 36 (1) · pp. 12-18
Author(s): Ruth von Brachel, Angela Bieda, Jürgen Margraf, Gerrit Hirschfeld

Abstract. The Brief Symptom Inventory (BSI)-18 is a widely used tool for assessing changes in general distress in patients, despite an ongoing debate about its factorial structure and a lack of evidence for longitudinal measurement invariance (LMI). We investigated BSI-18 scores from 1,081 patients at an outpatient clinic, collected after the 2nd, 6th, 10th, 18th, and 26th therapy sessions. Confirmatory factor analysis (CFA) was used to compare models comprising one, three, and four latent dimensions that have been proposed in the literature. LMI was investigated using a series of model comparisons based on chi-square tests, effect sizes, and changes in the comparative fit index (CFI). Psychological distress diminished over the course of therapy. A four-factor structure (depression, somatic symptoms, generalized anxiety, and panic) showed the best fit to the data at all measurement occasions. The series of model comparisons showed that constraining parameters to be equal across time resulted in very small decreases in model fit that did not exceed the cutoff for the assumption of measurement invariance. Our results show that the BSI-18 is best conceptualized as a four-dimensional tool that exhibits strict longitudinal measurement invariance. Clinicians and applied researchers do not have to be concerned about the interpretation of mean differences over time.
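The change-in-CFI criterion used in such model comparisons rests on computing CFI for each invariance model; a drop of more than about .01 between nested models is a common cutoff (Cheung & Rensvold). A minimal sketch of the standard CFI formula follows; it is generic, not code from this article:

```python
def cfi(chi2_model, df_model, chi2_baseline, df_baseline):
    """Comparative Fit Index from the target model and the independence (baseline) model."""
    model_noncentrality = max(chi2_model - df_model, 0.0)
    baseline_noncentrality = max(chi2_baseline - df_baseline, model_noncentrality)
    if baseline_noncentrality == 0.0:
        return 1.0  # both models fit exactly; CFI is conventionally 1
    return 1.0 - model_noncentrality / baseline_noncentrality


def delta_cfi(cfi_free, cfi_constrained):
    """Drop in CFI when moving to the more constrained (e.g., invariance) model."""
    return cfi_free - cfi_constrained
```

A ΔCFI below the .01 cutoff is then read as support for the added invariance constraints.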


2021
Author(s): Victoria Savalei, Jordan Brace, Rachel T. Fouladi

Comparison of nested models is common in applications of structural equation modeling (SEM). When two models are nested, model comparison can be done via a chi-square difference test or by comparing indices of approximate fit. The advantage of fit indices is that they permit some amount of misspecification in the additional constraints imposed on the model, which is a more realistic scenario. The most popular index of approximate fit is the root mean square error of approximation (RMSEA). In this article, we argue that the dominant way of comparing RMSEA values for two nested models, which is simply taking their difference, is problematic and will often mask misfit. We instead advocate computing the RMSEA associated with the chi-square difference test. We are not the first to propose this idea, and we review numerous methodological articles that have suggested it. Nonetheless, these articles appear to have had little impact on actual practice. The modification of current practice that we call for may be particularly needed in the context of measurement invariance assessment. We illustrate the difference between the current approach and our advocated approach with three examples, two of which involve multiple-group and longitudinal measurement invariance assessment, while the third involves comparisons of models with different numbers of factors. We conclude with a discussion of limitations and future research directions.
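The quantity advocated here, the RMSEA associated with the chi-square difference test (often written RMSEA_D), can be computed directly from the two nested fits. A minimal sketch, assuming the common point-estimate form sqrt(max(Δχ² − Δdf, 0) / (Δdf · (N − 1))); the sqrt-of-groups rescaling follows the usual multiple-group RMSEA convention and is an assumption here, not a formula quoted from the article:

```python
import math


def rmsea_d(chi2_constrained, df_constrained, chi2_free, df_free, n, n_groups=1):
    """Point estimate of the RMSEA associated with the chi-square difference test.

    The constrained model is nested in the free model; n is the total sample size.
    """
    d_chi2 = chi2_constrained - chi2_free
    d_df = df_constrained - df_free
    if d_df <= 0:
        raise ValueError("models are not nested in the expected direction")
    # Negative excess noncentrality is truncated to zero, as for the ordinary RMSEA.
    return math.sqrt(n_groups * max(d_chi2 - d_df, 0.0) / (d_df * (n - 1)))
```

Note that RMSEA_D can be large even when the simple difference of the two models' RMSEA values is near zero, which is exactly the masking problem the article describes.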


2021
Author(s): Ekta Sharma, Sandeep Sharma, Muhammad Al-Kudah, Angham Al-Tamimi

Abstract. Background and Objectives: Examining whether psychological instruments, scales, and frameworks developed in one nation are useful in other nations is a vital step in establishing their cross-national generalizability. This applies to measures of parental creativity nurturing behavior (PCNB), as parents play a central role in developing the creative potential of children in the 21st century. We reviewed various forms of measurement invariance for a sample of parents and systematically tested the measurement invariance of the PCNB scale across gender, age, culture, and language by comparing parents from Middle East Nations (MEN) and South-East Asian Nations (SEAN). Establishing invariance would support the wider use and acceptability of the PCNB scale, and would help parents in both regions monitor their creativity nurturing behavior (CNB) and take appropriate steps to enrich it, so that their children can lead creativity-accompanied lives.
Research design and method: Based on data from 931 parents (423 from SEAN, represented by India, and 508 from MEN, represented by Jordan), we used the PCNB Scale (Sharma & Sharma, 2021), which comprises four major factors, to examine its validity and reliability. Multisample confirmatory factor analysis was conducted to identify the factors and evaluate model fit for the PCNB construct. India and Jordan were selected randomly as representative nations, as both are emerging educational nations in their respective regions of Asia.
Results: For SEAN, the scale was valid and reliable with an excellent model fit: composite reliability CR(factor 1) = 0.818, CR(factor 2) = 0.872, CR(factor 3) = 0.670, CR(factor 4) = 0.729; factor loadings > 0.4; Cronbach's alpha = 0.712; and fit indices CFI = 0.967, chi-square/df = 1.492, GFI = 0.942, TLI = 0.95, RFI = 0.861, RMSEA = 0.059, IFI = 0.961, SRMR = 0.051. For MEN, the corresponding values were CR(factor 1) = 0.955, CR(factor 2) = 0.950, CR(factor 3) = 0.918, CR(factor 4) = 0.857; factor loadings > 0.4; Cronbach's alpha = 0.57; and fit indices CFI = 0.967, chi-square/df = 1.492, GFI = 0.942, TLI = 0.95, RFI = 0.861, RMSEA = 0.059, IFI = 0.961, SRMR = 0.038. These results support a four-factor model and a reliable PCNB construct at baseline in both samples.
Discussion and implications: The PCNB scale is a valid and reliable measure of parents' creativity nurturing behavior and thus enables a comprehensive evaluation of that behavior. It can serve as an important tool for assessing parental creativity nurturing behavior and parenting style. Used on a large scale, it would help parents identify their behavior and nurture creativity in children before they join school and throughout schooling and beyond, and it would help parents recognize when they need formal or informal counseling (training) to nurture creativity in their child.
Significance: The PCNB Scale is a valid and reliable instrument that can be used globally to assess parents' creativity nurturing behavior across gender, country, culture, and age. With strong psychometric properties, it can support the planning of parent counseling programs that encourage creativity nurturing behavior, thereby contributing to society by preparing parents to nurture children who can find creative solutions to the problems and challenges the world will face. It could also inform new teaching methods and pedagogies directed at nurturing creativity. This research adds to the current science of parents' creativity nurturing behavior.


2020
Author(s): Alyssa Counsell, Rob Cribbie

Measurement invariance (MI) is often concluded from a nonsignificant chi-square difference test. Researchers have also proposed using change in goodness-of-fit indices (ΔGOFs) instead. Both of these commonly used methods for testing MI have important limitations. To combat these issues, an equivalence test (EQ) was proposed to replace the chi-square difference test commonly used to test MI. Due to concerns with the EQ's power, an adjusted version (EQ-A) was created, but little evaluation of either procedure has been provided. The current study evaluated the Type I error and power of both the EQ and EQ-A, and compared their performance to that of the traditional chi-square difference test and ΔGOFs. The EQ was the only procedure that maintained empirical error rates below the nominal alpha level. Results also highlight that, to ensure adequate power, the EQ requires either larger sample sizes than traditional difference-based approaches or equivalence bounds based on larger-than-conventional RMSEA values (e.g., > .05). We do not recommend the proposed adjustment (EQ-A) over the EQ.


2021 · Vol 11 (2) · pp. 6849-6856
Author(s): D. Almaleki

The aim of this study is to provide an empirical evaluation of the influence of different design aspects on model stability in the context of factor analysis. The overall stability of factor solutions was evaluated by testing three levels of Measurement Invariance (MIV) in order, starting with configural invariance (model 0). Model testing was evaluated with the chi-square difference test (Δχ²) between two groups, the Root Mean Square Error of Approximation (RMSEA), the Comparative Fit Index (CFI), and the Tucker-Lewis Index (TLI). Factorial invariance results revealed that model stability varied across increasing levels of measurement invariance as a function of the Variable-To-Factor (VTF) ratio, the Subject-To-Variable (STV) ratio, and their interaction. Factor loadings and intercepts were invariant across the groups, indicating that measurement invariance was achieved. For VTF ratios of 4:1, 7:1, and 10:1, the models began to show stability across the levels of measurement invariance once the STV ratio reached 4:1, and the frequency of stable models over 1,000 replications increased (from 77% to 91%) as the STV ratio increased. The models were most stable at STV ratios of 32:1 or above.


2020
Author(s): Alyssa Counsell, Rob Cribbie, David B. Flora

Measurement invariance (MI) is often concluded from a nonsignificant chi-square difference test. Researchers have also proposed using change in goodness-of-fit indices (ΔGOFs) instead. Both of these commonly used methods for testing MI have important limitations. To combat these issues, Yuan and Chan (2016) proposed using an equivalence test (EQ) to replace the chi-square difference test commonly used to test MI. Due to their concerns with the EQ's power, Yuan and Chan also created an adjusted version (EQ-A), but provided little evaluation of either procedure. The current study evaluated the Type I error and power of both the EQ and EQ-A, and compared their performance to that of the traditional chi-square difference test and ΔGOFs. The EQ for nested model comparisons was the only procedure that always maintained empirical error rates below the nominal alpha level. Results also highlight that, to ensure adequate power, the EQ requires either larger sample sizes than traditional difference-based approaches or equivalence bounds based on larger-than-conventional RMSEA values (e.g., > .05). We do not recommend Yuan and Chan's proposed adjustment (EQ-A) over the EQ.
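The EQ flips the usual null hypothesis: rather than testing exact equality of the nested models, it tests whether the misfit added by the invariance constraints exceeds a tolerable bound ε0, and invariance is supported only when that bound can be rejected. A rough sketch in the spirit of this approach, assuming scipy and a simple single-group noncentrality parameterization λ0 = (N − 1) · Δdf · ε0² (an illustrative form, not Yuan and Chan's exact derivation):

```python
from scipy.stats import ncx2


def equivalence_test(d_chi2, d_df, n, eps0=0.05, alpha=0.05):
    """Equivalence test for a nested model comparison (illustrative sketch).

    H0: the added misfit is at least eps0 (in RMSEA units). Returning True
    (rejecting H0) supports treating the two models as practically equivalent.
    """
    # Noncentrality parameter at the equivalence bound (assumed single-group form).
    nc = (n - 1) * d_df * eps0 ** 2
    # Reject H0 if the observed difference falls below the alpha quantile
    # of the noncentral chi-square distribution at the bound.
    crit = ncx2.ppf(alpha, d_df, nc)
    return bool(d_chi2 < crit)
```

The power issue the study reports is visible in this setup: for small N the critical value shrinks toward zero, so even a tiny observed Δχ² may fail to establish equivalence.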


2017 · Vol 78 (2) · pp. 272-296
Author(s): Natalie A. Koziol, James A. Bovaird

Evaluations of measurement invariance provide essential construct validity evidence—a prerequisite for seeking meaning in psychological and educational research and ensuring fair testing procedures in high-stakes settings. However, the quality of such evidence is partly dependent on the validity of the resulting statistical conclusions. Type I or Type II errors can render measurement invariance conclusions meaningless. The present study used Monte Carlo simulation methods to compare the effects of multiple model parameterizations (linear factor model, Tobit factor model, and categorical factor model) and estimators (maximum likelihood [ML], robust maximum likelihood [MLR], and weighted least squares mean and variance-adjusted [WLSMV]) on the performance of the chi-square test for the exact-fit hypothesis and chi-square and likelihood ratio difference tests for the equal-fit hypothesis for evaluating measurement invariance with ordered polytomous data. The test statistics were examined under multiple generation conditions that varied according to the degree of metric noninvariance, the size of the sample, the magnitude of the factor loadings, and the distribution of the observed item responses. The categorical factor model with WLSMV estimation performed best for evaluating overall model fit, and the categorical factor model with ML and MLR estimation performed best for evaluating change in fit. Results from this study should be used to inform the modeling decisions of applied researchers. However, no single analysis combination can be recommended for all situations. Therefore, it is essential that researchers consider the context and purpose of their study.

