When Nonresponse Mechanisms Change: Effects on Trends and Group Comparisons in International Large-Scale Assessments

2019 ◽  
Vol 79 (4) ◽  
pp. 699-726 ◽  
Author(s):  
Karoline A. Sachse ◽  
Nicole Mahler ◽  
Steffi Pohl

Mechanisms causing item nonresponses in large-scale assessments are often said to be nonignorable. Parameter estimates can be biased if nonignorable missing data mechanisms are not adequately modeled. In trend analyses, it is plausible for the missing data mechanism and the percentage of missing values to change over time. In this article, we investigated (a) the extent to which the missing data mechanism and the percentage of missing values changed over time in real large-scale assessment data, (b) how different approaches for dealing with missing data performed under such conditions, and (c) the practical implications for trend estimates. These issues are highly relevant because the conclusions hold for all kinds of group mean differences in large-scale assessments. In a reanalysis of PISA (Programme for International Student Assessment) data from 35 OECD countries, we found that missing data mechanisms and numbers of missing values varied considerably across time points, countries, and domains. In a simulation study, we generated data in which we allowed the missing data mechanism and the amount of missing data to change over time. We showed that the trend estimates were biased if differences in the missing-data mechanisms were not taken into account, in our case, when omissions were scored as wrong, when omissions were ignored, or when model-based approaches assuming a constant missing data mechanism over time were used. The results suggest that the most accurate estimates can be obtained from the application of multiple group models for nonignorable missing values when the amounts of missing data and the missing data mechanisms changed over time. In an empirical example, we furthermore showed that the large decline in PISA reading literacy in Ireland in 2009 was reduced when we estimated trends using missing data treatments that accounted for changes in missing data mechanisms.
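The article's central point about trends can be illustrated with a toy simulation (a deliberate simplification: omissions here occur completely at random, whereas the article studies nonignorable mechanisms, and all rates and sample sizes are invented). When the omission rate rises between two time points while true proficiency stays constant, scoring omissions as wrong manufactures a spurious decline, while simply dropping them does not:

```python
import random

random.seed(0)
N = 200_000          # item responses per time point
P_CORRECT = 0.6      # true solution probability, constant across time points

def observed_means(omit_rate):
    """Return (mean with omissions scored as wrong, mean over answered items only)."""
    scored, answered = [], []
    for _ in range(N):
        correct = 1.0 if random.random() < P_CORRECT else 0.0
        if random.random() < omit_rate:      # the item is omitted
            scored.append(0.0)               # "scored as wrong" treatment
        else:
            scored.append(correct)
            answered.append(correct)         # "ignored" treatment drops omissions
    return sum(scored) / len(scored), sum(answered) / len(answered)

t1_wrong, t1_ign = observed_means(0.05)      # time point 1: 5% omissions
t2_wrong, t2_ign = observed_means(0.15)      # time point 2: 15% omissions
print(f"trend, omissions scored as wrong: {t2_wrong - t1_wrong:+.3f}")  # spurious decline
print(f"trend, omissions ignored:         {t2_ign - t1_ign:+.3f}")      # near zero
```

Under the nonignorable mechanisms the article considers, ignoring omissions is biased as well, which is why the model-based multiple group approaches are needed.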

Methodology ◽  
2020 ◽  
Vol 16 (2) ◽  
pp. 147-165 ◽  
Author(s):  
Steffi Pohl ◽  
Benjamin Becker

Approaches for dealing with item omission include incorrect scoring, ignoring missing values, and model-based approaches for nonignorable missing values; so far, these have been evaluated only for certain forms of nonignorability. In this paper we investigate the performance of these approaches under various conditions of nonignorability, that is, when the missing response depends on i) the item response, ii) a latent missing propensity, or iii) both. No approach results in unbiased parameter estimates of the Rasch model under all missing data mechanisms. Incorrect scoring yields unbiased estimates only under very specific data constellations of missing mechanisms i) and iii). The approach for nonignorable missing values yields unbiased estimates only under condition ii). Ignoring results in slightly more biased estimates than the approach for nonignorable missing values, while the latter also indicates the presence of nonignorability under all simulated conditions. We illustrate the results in an empirical example on PISA data.


Marketing ZFP ◽  
2019 ◽  
Vol 41 (4) ◽  
pp. 21-32
Author(s):  
Dirk Temme ◽  
Sarah Jensen

Missing values are ubiquitous in empirical marketing research. If missing data are not dealt with properly, this can lead to a loss of statistical power and distorted parameter estimates. While traditional approaches for handling missing data (e.g., listwise deletion) are still widely used, researchers can nowadays choose among various advanced techniques such as multiple imputation or full-information maximum likelihood estimation. Thanks to the available software, using these modern missing data methods poses no major obstacle. Still, their application requires a sound understanding of their prerequisites and limitations as well as a deeper understanding of the processes that have led to missing values in an empirical study. This article is Part 1; it first introduces Rubin’s classical definition of missing data mechanisms and an alternative, variable-based taxonomy that provides a graphical representation. Second, it presents a selection of visualization tools, available in different R packages, for describing and exploring missing data structures.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Nishith Kumar ◽  
Md. Aminul Hoque ◽  
Masahiro Sugimoto

Abstract Mass spectrometry is a modern, sophisticated, high-throughput analytical technique that enables large-scale metabolomic analyses. It yields a high-dimensional matrix (samples × metabolites) of quantified data that often contains missing cells as well as outliers arising from several sources, both technical and biological. Although several missing data imputation techniques are described in the literature, the existing conventional techniques address only the missing value problem; they do not mitigate the problem of outliers, and outliers in the dataset therefore decrease the accuracy of the imputation. We developed a new kernel weight function-based missing data imputation technique that resolves both problems, missing values and outliers. We evaluated the performance of the proposed method against conventional and recently developed imputation techniques using both artificially generated and experimentally measured data, in both the absence and presence of outliers at different rates. Performance on both artificial and real metabolomics data indicates the superiority of our kernel weight-based missing data imputation technique over the existing alternatives. For user convenience, an R package implementing the proposed technique was developed and is available at https://github.com/NishithPaul/tWLSA.
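The general idea of kernel-weighted, outlier-resistant imputation can be sketched in a few lines (this is not the authors' tWLSA algorithm; the kernel choice, bandwidth, and toy data are invented here). A Gaussian kernel centred on a robust location estimate gives an outlier near-zero weight, so it barely contaminates the imputed value:

```python
import math

# toy "metabolite" column: a tight cluster of values, one gross outlier (30.0),
# and one missing cell (None)
column = [2.0, 2.4, 2.8, 3.2, 3.6, 4.0, 30.0, None]

def kernel_weighted_impute(values, bandwidth=1.0):
    """Fill missing entries with a kernel-weighted column average.

    A Gaussian kernel centred on the column median gives values far from the
    bulk of the data near-zero weight, so the outlier barely influences the
    imputed value -- unlike a plain column mean.
    """
    obs = sorted(v for v in values if v is not None)
    median = obs[len(obs) // 2]
    num = den = 0.0
    for v in obs:
        w = math.exp(-((v - median) / bandwidth) ** 2)   # outlier weight ~ 0
        num += w * v
        den += w
    fill = num / den
    return [fill if v is None else v for v in values]

obs_vals = [v for v in column if v is not None]
plain_mean = sum(obs_vals) / len(obs_vals)        # dragged upward by the outlier
imputed = kernel_weighted_impute(column)[-1]      # stays near the bulk (~3.1)
print(f"plain mean: {plain_mean:.2f}, kernel-weighted imputation: {imputed:.2f}")
```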


Biometrika ◽  
2016 ◽  
Vol 103 (1) ◽  
pp. 175-187 ◽  
Author(s):  
Jun Shao ◽  
Lei Wang

Abstract To estimate unknown population parameters based on data having nonignorable missing values with a semiparametric exponential tilting propensity, Kim & Yu (2011) assumed that the tilting parameter is known or can be estimated from external data, in order to avoid the identifiability issue. To remove this serious limitation on the methodology, we use an instrument, i.e., a covariate related to the study variable but unrelated to the missing data propensity, to construct some estimating equations. Because these estimating equations are semiparametric, we profile the nonparametric component using a kernel-type estimator and then estimate the tilting parameter based on the profiled estimating equations and the generalized method of moments. Once the tilting parameter is estimated, so is the propensity, and then other population parameters can be estimated using the inverse propensity weighting approach. Consistency and asymptotic normality of the proposed estimators are established. The finite-sample performance of the estimators is studied through simulation, and a real-data example is also presented.
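The final inverse-propensity-weighting step can be sketched as follows (illustrative only: a simple parametric logistic tilt stands in for the semiparametric exponential tilting, and the tilting parameter is treated as known, which is exactly the Kim & Yu (2011) assumption that the paper's instrument-based GMM estimation removes):

```python
import math
import random

random.seed(3)

# response propensity depends on the study variable y itself (nonignorable):
# pi(y) = 1 / (1 + exp(-(a + phi * y))), with tilting parameter phi
a, phi = 0.5, -1.0
population = [random.gauss(1.0, 1.0) for _ in range(200_000)]   # true mean = 1.0

def propensity(y):
    return 1.0 / (1.0 + math.exp(-(a + phi * y)))

observed = [y for y in population if random.random() < propensity(y)]

# complete-case mean is biased downward: large y responds less often
cc_mean = sum(observed) / len(observed)

# inverse propensity weighting with the tilting parameter plugged in
weights = [1.0 / propensity(y) for y in observed]
ipw_mean = sum(w * y for w, y in zip(weights, observed)) / sum(weights)

print(f"true mean 1.0 | complete-case {cc_mean:.3f} | IPW {ipw_mean:.3f}")
```

The paper's contribution lies upstream of this step: identifying and estimating the tilting parameter from the data themselves using an instrument.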


2020 ◽  
Author(s):  
Michael Rotolo

Abstract Religiosity remains an important sociological concept, from assessing religion’s effects on various outcomes to describing large-scale religious change. And yet conceptualizing religiosity—as a measure of intensity of religious practice—requires accounting for how respondents understand religious practice. Drawing on four waves of longitudinal interview data from the National Study of Youth and Religion (NSYR), this paper examines the religious understandings of young Americans as they develop over 10 years. I find that respondents’ religious understandings are shaped by deeper moral orientations that broadly structure their lives. From these moral orientations, I theorize four ideal types of religious practitioners that help explain complex patterns of religiosity in America—the Congregant, the Believer, the Spiritualist, and the Metaphysician. Recognizing the moral orders that structure young Americans’ religious understandings opens new pathways for theorizing religion’s influence and change over time.


2020 ◽  
Vol 133 (2) ◽  
pp. 303-324
Author(s):  
Ismee Tames

Abstract Digital Access to the Legal Files of Those Tried for Nazi Collaboration in the Netherlands: Possibilities and Impossibilities. This article reflects on the findings of a pilot project called Triado that digitized a sample of the 4 km of legal files created by the Special Jurisdiction for investigating Dutch Nazi collaboration (CABR) in the years after the Second World War. We show that large-scale digitization may help to analyze complex historical sources in new ways, thus deepening our understanding of the consequences of war and genocide. However, this can be achieved only if all specialists involved develop ways to deal with ambiguity in the sources: instead of disambiguation, we need mixed approaches that allow data to have multiple meanings and interpretations of meaning to change over time. This article offers suggestions and gives a brief overview of some of the possibilities for researchers and lay users of digitized historical sources.


Author(s):  
HUA FANG ◽  
KIMBERLY ANDREWS ESPY ◽  
MARIA L. RIZZO ◽  
CHRISTIAN STOPP ◽  
SANDRA A. WIEBE ◽  
...  

Methods for identifying meaningful growth patterns of longitudinal trial data with both nonignorable intermittent and drop-out missingness are rare. In this study, a combined approach with statistical and data mining techniques is utilized to address the nonignorable missing data issue in growth pattern recognition. First, a parallel mixture model is proposed to model the nonignorable missing information from a real-world patient-oriented study and concurrently to estimate the growth trajectories of participants. Then, based on individual growth parameter estimates and their auxiliary feature attributes, a fuzzy clustering method is incorporated to identify the growth patterns. This case study demonstrates that the combined multi-step approach can achieve both statistical generality and computational efficiency for growth pattern recognition in longitudinal studies with nonignorable missing data.
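The final clustering step can be sketched with a basic fuzzy c-means routine (a generic sketch on invented growth parameters, not the study's parallel mixture model or its actual auxiliary feature attributes). Each participant's estimated intercept and slope get a soft membership degree in each growth pattern:

```python
import random

random.seed(4)

# per-participant growth parameter estimates (intercept, slope) from two
# latent patterns: "improvers" around (0, 1) and "decliners" around (2, -1)
points = ([(random.gauss(0, 0.3), random.gauss(1, 0.3)) for _ in range(60)] +
          [(random.gauss(2, 0.3), random.gauss(-1, 0.3)) for _ in range(60)])

def fuzzy_cmeans(points, k=2, m=2.0, iters=50):
    """Basic fuzzy c-means: every point gets a membership degree in each cluster."""
    centers = [points[0], points[-1]]            # deterministic initialisation
    for _ in range(iters):
        U = []                                   # memberships, one row per point
        for p in points:
            d = [max(1e-12, sum((a - b) ** 2 for a, b in zip(p, c)) ** 0.5)
                 for c in centers]
            U.append([1.0 / sum((d[c] / d[j]) ** (2.0 / (m - 1.0)) for j in range(k))
                      for c in range(k)])
        centers = []                             # membership-weighted mean update
        for c in range(k):
            w = [u[c] ** m for u in U]
            centers.append(tuple(
                sum(wi * p[dim] for wi, p in zip(w, points)) / sum(w)
                for dim in range(2)))
    return centers, U

centers, memberships = fuzzy_cmeans(points)
print([tuple(round(v, 2) for v in c) for c in sorted(centers)])
```

The recovered centers land near the two generating patterns; the soft memberships, rather than hard assignments, are what make the approach suitable for ambiguous growth trajectories.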


2020 ◽  
pp. 249-263
Author(s):  
Luisa Araújo ◽  
Patrícia Costa ◽  
Nuno Crato

Abstract This chapter provides a short description of what the Programme for International Student Assessment (PISA) measures and how it measures it. First, it details the concepts associated with the measurement of student performance and the concepts associated with capturing student and school characteristics and explains how they compare with some other International Large-Scale Assessments (ILSA). Second, it provides information on the assessment of reading, the main domain in PISA 2018. Third, it provides information on the technical aspects of the measurements in PISA. Lastly, it offers specific examples of PISA 2018 cognitive items, corresponding domains (mathematics, science, and reading), and related performance levels.


2019 ◽  
Author(s):  
Ananya Bhattacharjee ◽  
Md. Shamsuzzoha Bayzid

Abstract
Background: Due to recent advances in sequencing technologies and in species tree estimation methods capable of taking gene tree discordance into account, notable progress has been achieved in constructing large-scale phylogenetic trees from genome-wide data. However, substantial challenges remain in leveraging this huge amount of molecular data. Foremost among these challenges is the need for efficient tools that can handle missing data: popular distance-based methods such as neighbor joining and UPGMA require that the input distance matrix contain no missing values.
Results: We introduce two highly accurate machine learning-based distance imputation techniques. One approach is based on matrix factorization; the other is an autoencoder-based deep learning technique. We evaluate both on a collection of simulated and biological datasets and show that they match or improve upon the best alternative techniques for distance imputation. Moreover, our techniques can handle substantial amounts of missing data, to the extent where the best alternative methods fail.
Conclusions: This study shows for the first time the power and feasibility of applying deep learning techniques to imputing distance matrices. The autoencoder-based technique is highly accurate and scales to large datasets. We have made these techniques freely available as cross-platform software at https://github.com/Ananya-Bhattacharjee/ImputeDistances.
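The matrix-factorization idea can be sketched generically (this is not the authors' method or software; a plain SGD factorization on a toy squared-distance matrix, whose rank is at most 4, illustrates how deliberately hidden entries become recoverable from the rest):

```python
import random

random.seed(5)

# squared-distance matrix among 12 random 2-D points; such a matrix has rank
# at most 4, so a rank-4 factorization can recover hidden entries exactly
pts = [(random.uniform(0, 1), random.uniform(0, 1)) for _ in range(12)]
n = len(pts)
D = [[(pts[i][0] - pts[j][0]) ** 2 + (pts[i][1] - pts[j][1]) ** 2
      for j in range(n)] for i in range(n)]

missing = {(1, 7), (7, 1), (3, 9), (9, 3)}       # entries hidden from the fit

r, lr, epochs = 4, 0.03, 3000                    # rank, learning rate, passes
U = [[random.gauss(0, 0.1) for _ in range(r)] for _ in range(n)]
V = [[random.gauss(0, 0.1) for _ in range(r)] for _ in range(n)]

for _ in range(epochs):                          # SGD over the observed entries only
    for i in range(n):
        for j in range(n):
            if (i, j) in missing:
                continue
            err = D[i][j] - sum(U[i][k] * V[j][k] for k in range(r))
            for k in range(r):
                U[i][k], V[j][k] = (U[i][k] + lr * err * V[j][k],
                                    V[j][k] + lr * err * U[i][k])

est = sum(U[1][k] * V[7][k] for k in range(r))   # reconstruct a hidden entry
print(f"true {D[1][7]:.3f}  imputed {est:.3f}")
```

Real phylogenetic distance matrices are not exactly low rank, which is one motivation for the authors' more flexible autoencoder-based alternative.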

