Estimating Measurement Error in SIPP Annual Job Earnings: A Comparison of Census Bureau Survey and SSA Administrative Data

Integration of statistical and administrative agricultural data from Namibia

Statistical Journal of the IAOS ◽

10.3233/sji-200634 ◽

2021 ◽

pp. 1-22

Author(s):

Emily Berg ◽

Jongho Im ◽

Zhengyuan Zhu ◽

Colin Lewis-Beck ◽

Jie Li

Keyword(s):

Measurement Error ◽

Data Collection ◽

Administrative Data ◽

Statistical Data ◽

Data Sources ◽

Extension Programs ◽

Multiple Data ◽

Administrative Agencies ◽

Crop Area ◽

Using Data

Statistical and administrative agencies often collect information on related parameters. Discrepancies between estimates from distinct data sources can arise due to differences in definitions, reference periods, and data collection protocols. Integrating statistical data with administrative data is appealing for saving data collection costs, reducing respondent burden, and improving the coherence of estimates produced by statistical and administrative agencies. Model-based techniques for combining multiple data sources, such as small area estimation and measurement error models, offer transparency, reproducibility, and the ability to provide an estimated uncertainty. Issues associated with integrating statistical data with administrative data are discussed in the context of data from Namibia. The national statistical agency in Namibia produces estimates of crop area using data from probability samples. Simultaneously, the Namibia Ministry of Agriculture, Water, and Forestry obtains crop area estimates through extension programs. We illustrate the use of a structural measurement error model for the purpose of synthesizing the administrative and survey data to form a unified estimate of crop area. Limitations on the available data preclude us from conducting a genuine, thorough application. Nonetheless, our illustration of methodology holds potential use for a general practitioner.

Earnings Dynamics and Measurement Error in Matched Survey and Administrative Data

SSRN Electronic Journal ◽

10.2139/ssrn.2854596 ◽

2016 ◽

Author(s):

Dean Hyslop ◽

Wilbur Townsend

Keyword(s):

Measurement Error ◽

Administrative Data ◽

Earnings Dynamics

Working with federal government agencies to unlock administrative data

International Journal for Population Data Science ◽

10.23889/ijpds.v3i5.1064 ◽

2018 ◽

Vol 3 (5) ◽

Author(s):

Jennifer Auer

Keyword(s):

Administrative Data ◽

Source Coding ◽

Low Cost ◽

Census Bureau ◽

Project Proposal ◽

The Public ◽

Econometric Methods ◽

Technical Specifications ◽

Data Source ◽

Link Information

Federal administrative data is a low-cost and low-burden data source for evidence-based policy making. By linking information from different surveys, or over time, researchers can achieve the sample size and variation needed for advanced econometric methods. However, the personally identifying information (PII) needed to link information means that these data are not available to the public. One solution is to provide technical specifications to the requisite agency(s) to execute the research. This paper outlines the process and pitfalls of drafting specifications for an implementing party who knows more about the data than you do. Drawing on experience from working with the U.S. Census Bureau and knowledge gained from related literatures, such as open-source coding, this paper recommends the depth of description, order of data manipulation and analysis, and requested output to make these collaborative projects successful. A federal administrative data project proposal template is offered. The paper also advises on information that federal agencies can supply to facilitate the use of these important data sources.

MATCH BIAS OR NONIGNORABLE NONRESPONSE? IMPROVED IMPUTATION AND ADMINISTRATIVE DATA IN THE CPS ASEC

Journal of Survey Statistics and Methodology ◽

10.1093/jssam/smaa022 ◽

2020 ◽

Author(s):

Charles Hokayem ◽

Trivellore Raghunathan ◽

Jonathan Rothbaum

Keyword(s):

Administrative Data ◽

Current Population Survey ◽

Population Survey ◽

Census Bureau ◽

Median Income ◽

Efficiency Gains ◽

Poverty And Inequality ◽

Nonignorable Nonresponse ◽

Using Data ◽

Sequential Regression

We test an improved imputation technique, sequential regression multivariate imputation (SRMI), for the Current Population Survey Annual Social and Economic Supplement to address match bias. Furthermore, we augment the model with administrative tax data to test for nonignorable nonresponse. Using data from 2009, 2011, and 2013, we find that the current hot deck imputation used by the Census Bureau produces different distribution statistics, downward for poverty and inequality and upward for median income, relative to the SRMI model-based estimates. Our results suggest that these differences are a result of match bias, not nonignorable nonresponse. Nearly all poverty, median income, and inequality estimates are not significantly different when comparing imputation models with and without administrative data. However, there are clear efficiency gains from using administrative data.

Earnings Dynamics and Measurement Error in Matched Survey and Administrative Data

Journal of Business and Economic Statistics ◽

10.1080/07350015.2018.1514308 ◽

2018 ◽

Vol 38 (2) ◽

pp. 457-469 ◽

Cited By ~ 2

Author(s):

Dean R. Hyslop ◽

Wilbur Townsend

Keyword(s):

Measurement Error ◽

Administrative Data ◽

Earnings Dynamics

Estimating the density of ethnic minorities and aged people in Berlin: multivariate kernel density estimation applied to sensitive georeferenced administrative data protected via measurement error

Journal of the Royal Statistical Society Series A (Statistics in Society) ◽

10.1111/rssa.12179 ◽

2016 ◽

Vol 180 (1) ◽

pp. 161-183 ◽

Cited By ~ 5

Author(s):

Marcus Groß ◽

Ulrich Rendtel ◽

Timo Schmid ◽

Sebastian Schmon ◽

Nikos Tzavidis

Keyword(s):

Measurement Error ◽

Density Estimation ◽

Ethnic Minorities ◽

Administrative Data ◽

Kernel Density Estimation ◽

Kernel Density ◽

Aged People

Imputation of Binary Treatment Variables With Measurement Error in Administrative Data

Journal of the American Statistical Association ◽

10.1198/016214505000000754 ◽

2005 ◽

Vol 100 (472) ◽

pp. 1123-1132 ◽

Cited By ~ 26

Author(s):

Recai M Yucel ◽

Alan M Zaslavsky

Keyword(s):

Measurement Error ◽

Administrative Data

Measurement Error and Misclassification: A Comparison of Survey and Administrative Data

Journal of Labor Economics ◽

10.1086/513298 ◽

2007 ◽

Vol 25 (3) ◽

pp. 513-551 ◽

Cited By ~ 56

Author(s):

Arie Kapteyn ◽

Jelmer Y. Ypma

Keyword(s):

Measurement Error ◽

Administrative Data

A bayesian way to correct for measurement error in drug risk estimates from EHR data.

International Journal for Population Data Science ◽

10.23889/ijpds.v3i4.1023 ◽

2018 ◽

Vol 3 (4) ◽

Cited By ~ 1

Author(s):

Maxime Lavigne ◽

Robert Goulden ◽

Bettina Habib ◽

Lawrence Joseph ◽

Nadyne Girard ◽

...

Keyword(s):

Measurement Error ◽

Administrative Data ◽

Linked Data ◽

Risk Estimates ◽

Drug Risk ◽

Hazard Ratios ◽

Exposure Measurement Error ◽

Oral Hypoglycemics ◽

The UK

Introduction: Data from electronic medical records (EMR) are now readily available and record information needed in pharmacoepidemiological studies that is not usually found in administrative data, such as risk factors and biometrics. Yet, EMR data leads to measurement error due to primary non-adherence. Bayesian bias correction could provide corrected estimates from administrative data. Objectives and Approach: We present a method for correcting risk estimates from EMR data using linked data. In our example, we estimate the risk of cardiovascular events from oral-hypoglycemics in patients with type-2 diabetes in Boston, Quebec, and the UK between 2009 and 2012. Using linked EMR and administrative data in Quebec, we compute positive and negative predictive values (PPV and NPV) of prescription on dispensation for each class of oral-hypoglycemics. The cardiovascular risk is then analysed using a Bayesian Weibull survival model adjusted for potential confounders. A similar model is then computed that accounts for exposure measurement error using the PPV and NPV. Results: The Quebec and Boston cohorts have similar sizes with 1197 and 2346 patients, but the UK was bigger at 41370 patients. In Quebec's data, there were important differences in PPV and NPV by class of oral-hypoglycemics with PPVs for Biguanides at 0.81, Sulphonylureas at 0.65, and others at 0.50. The pattern for NPV differed with the same classes having respectively values of 0.56, 0.97, and 0.99. Estimates from the naïve model are typical of similar analyses but, compared to the corrected estimates, were generally overprecise and biased towards the null. The adjusted estimates adequately represented the increased uncertainty, with hazard ratios for Sulphonylureas going from 1.72 (1.22, 2.41) to 3.19 (1.36, 5.93), and from 1.09 (0.86, 1.39) to 1.05 (0.45, 2.16) for no drugs. Conclusion/Implications: Bayesian adjustment for measurement error allowed us to use linked data to regenerate uncertainty and to correct the bias in our risk estimates. Our approach was impacted by the observed low predictive value of prescribing, by reduced transportability of our PPV and NPV estimates, and other sources of bias.
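
The predictive values described in this abstract can be illustrated with a minimal sketch. It assumes a 2x2 cross-tabulation of prescription (from EMR) against dispensation (from administrative data); the function name and the counts below are hypothetical and are not taken from the study. It shows only the PPV/NPV arithmetic, not the Bayesian Weibull model.

```python
def predictive_values(tp: int, fp: int, fn: int, tn: int) -> tuple[float, float]:
    """Return (PPV, NPV) of prescription as a predictor of dispensation.

    tp: prescribed and dispensed
    fp: prescribed but not dispensed (primary non-adherence)
    fn: not prescribed but dispensed
    tn: not prescribed and not dispensed
    """
    if tp + fp == 0 or tn + fn == 0:
        raise ValueError("each prescription group needs at least one patient")
    ppv = tp / (tp + fp)  # P(dispensed | prescribed)
    npv = tn / (tn + fn)  # P(not dispensed | not prescribed)
    return ppv, npv


# Hypothetical counts, chosen only to keep the arithmetic easy to follow.
ppv, npv = predictive_values(tp=40, fp=10, fn=5, tn=45)
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")  # prints "PPV = 0.80, NPV = 0.90"
```

A low PPV, such as the 0.50 reported for the "others" class, would mean that about half of the recorded prescriptions were never dispensed, which is why the abstract treats prescription as an error-prone measure of exposure.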

Measurement of population income: Variants of estimating biases

Voprosy Ekonomiki ◽

10.32609/0042-8736-2020-1-127-144 ◽

2020 ◽

pp. 127-144

Author(s):

T. Yu. Cherkashina

Keyword(s):

Measurement Error ◽

Administrative Data ◽

Economic Status ◽

Personal Income ◽

Living Standards ◽

Individual Income ◽

Income Data ◽

Monitoring Survey ◽

Different Sources

Income is one of the most obvious and frequently used indicators of economic status and living standards. Surveys of households and individuals are the main sources of income data for sociologists and economists. Administrative data is added to them on a growing scale. Comparison of data obtained from different sources or surveys using different methods allows us to estimate biases and sources of errors, and demonstrates the absence of “ideal” income data in general. The review of foreign studies on this problem is supplemented by an example of calculations on data from the Russia Longitudinal Monitoring Survey - Higher School of Economics (RLMS-HSE): we compare the compositional individual income, calculated as the sum of types of income, and the total personal income reported by respondents. The first measurement of individual incomes has turned out to be more consistent and definite, less prone to measurement error, but gives lower values of individual incomes. The differences between the total personal income reported by respondents and the compositional individual income are due not so much to the inaccuracy of the summation and rounding as to “conceptual” features of understanding of personal income by some respondents. Such comparisons are necessary in order to understand the limitations of various measurements of income and to make a grounded and reflexive choice of its specific indicators.
