Stochastic Revealed Preferences with Measurement Error

Author(s):  
Victor H Aguiar ◽  
Nail Kashaev

Abstract A long-standing question about consumer behaviour is whether individuals’ observed purchase decisions satisfy the revealed preference (RP) axioms of the utility maximization theory (UMT). Researchers using survey or experimental panel data sets on prices and consumption to answer this question face the well-known problem of measurement error. We show that ignoring measurement error in the RP approach may lead to overrejection of the UMT. To solve this problem, we propose a new statistical RP framework for consumption panel data sets that allows for testing the UMT in the presence of measurement error. Our test is applicable to all consumer models that can be characterized by their first-order conditions. Our approach is non-parametric, allows for unrestricted heterogeneity in preferences and requires only a centring condition on measurement error. We develop two applications that provide new evidence about the UMT. First, we find support in a survey data set for the dynamic and time-consistent UMT in single-individual households, in the presence of nonclassical measurement error in consumption. In the second application, we cannot reject the static UMT in a widely used experimental data set in which measurement error in prices is assumed to be the result of price misperception due to the experimental design. The first finding stands in contrast to the conclusions drawn from the deterministic RP test of Browning (1989, International Economic Review, 979–992). The second finding reverses the conclusions drawn from the deterministic RP test of Afriat (1967, International Economic Review, 8, 67–77) and Varian (1982, Econometrica, 945–973).
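For orientation, the deterministic RP test that the abstract contrasts itself with can be sketched as a check of the Generalized Axiom of Revealed Preference (GARP), the condition used in Varian (1982). The sketch below is not the authors' statistical test: it takes observed prices and quantities at face value, which is exactly the situation in which measurement error can cause overrejection. The function name `satisfies_garp` and the two-period example data are illustrative assumptions.

```python
import numpy as np

def satisfies_garp(prices, quantities):
    """Deterministic GARP check; rows are observations, columns are goods."""
    p = np.asarray(prices, dtype=float)
    x = np.asarray(quantities, dtype=float)
    cost = p @ x.T               # cost[t, s] = p_t . x_s
    own = np.diag(cost)          # own[t]     = p_t . x_t

    # Direct revealed preference: x_t R x_s if x_s was affordable when x_t was chosen.
    R = own[:, None] >= cost

    # Transitive closure of R (Warshall's algorithm).
    for k in range(len(own)):
        R = R | (R[:, [k]] & R[[k], :])

    # strict[s, t] is True when x_t was strictly cheaper than x_s at prices p_s.
    strict = own[:, None] > cost

    # GARP fails if x_t R x_s while x_s is strictly directly preferred to x_t.
    return not np.any(R & strict.T)

# Hypothetical two-period, two-good data.
prices = [[1, 2], [2, 1]]
consistent_choices = [[2, 1], [1, 2]]   # each bundle is the cheaper one at its own prices
violating_choices = [[1, 2], [2, 1]]    # each bundle was chosen while the other was cheaper

garp_ok = satisfies_garp(prices, consistent_choices)
garp_bad = satisfies_garp(prices, violating_choices)
```

Because the comparison uses exact inequalities, a small error in recorded prices or quantities can flip a pass into a rejection, which is the concern the abstract raises.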

2003 ◽  
Vol 3 (1) ◽  
Author(s):  
Matthew E Kahn

Abstract Under communism, Eastern Europe's cities were significantly more polluted than their Western European counterparts. An unintended consequence of communism's decline is to improve urban environmental quality. This paper uses several new data sets to measure these gains. National level data are used to document the extent of convergence across nations in sulfur dioxide and carbon dioxide emissions. Based on a panel data set from the Czech Republic, Hungary and Poland, ambient sulfur dioxide levels have fallen both because of composition and technique effects. The incidence of this local public good improvement is analyzed.


2017 ◽  
Vol 3 (1) ◽  
Author(s):  
Dora Matzke ◽  
Alexander Ly ◽  
Ravi Selker ◽  
Wouter D. Weeda ◽  
Benjamin Scheibehenne ◽  
...  

Whenever parameter estimates are uncertain or observations are contaminated by measurement error, the Pearson correlation coefficient can severely underestimate the true strength of an association. Various approaches exist for inferring the correlation in the presence of estimation uncertainty and measurement error, but none are routinely applied in psychological research. Here we focus on a Bayesian hierarchical model proposed by Behseta, Berdyyeva, Olson, and Kass (2009) that allows researchers to infer the underlying correlation between error-contaminated observations. We show that this approach may also be applied to obtain the underlying correlation between uncertain parameter estimates as well as the correlation between uncertain parameter estimates and noisy observations. We illustrate the Bayesian modeling of correlations with two empirical data sets; in each data set, we first infer the posterior distribution of the underlying correlation and then compute Bayes factors to quantify the evidence that the data provide for the presence of an association.
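The attenuation described at the start of this abstract can be illustrated with a small simulation. This is only a sketch of the problem, not the Bayesian hierarchical model the abstract discusses; the true correlation of 0.8, the unit noise variance, and all variable names are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
true_rho = 0.8      # assumed correlation between the error-free variables
noise_sd = 1.0      # assumed measurement-error SD (latent variables have SD 1)

# Error-free (latent) pairs with unit variances and correlation true_rho.
cov = [[1.0, true_rho], [true_rho, 1.0]]
latent = rng.multivariate_normal([0.0, 0.0], cov, size=n)

# Observed pairs: latent values plus independent measurement error.
observed = latent + rng.normal(0.0, noise_sd, size=latent.shape)

r_latent = np.corrcoef(latent.T)[0, 1]
r_observed = np.corrcoef(observed.T)[0, 1]

# Classical attenuation: the observed correlation shrinks by the reliability,
# var(latent) / (var(latent) + var(noise)), here equal for both variables.
reliability = 1.0 / (1.0 + noise_sd**2)
r_expected = true_rho * reliability

print(f"latent r = {r_latent:.3f}, observed r = {r_observed:.3f}, "
      f"attenuation formula = {r_expected:.3f}")
```

With equal error variance on both variables, the Pearson coefficient computed on the noisy data lands near half of the latent value, even though the underlying association is unchanged.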


Author(s):  
Samer Madanat ◽  
Hee Cheol Shin

Pavement distress progression models predict the extent of a distress on pavement sections as a function of age, design characteristics, traffic loads and environmental factors. These models are usually developed using data from in-service facilities to calibrate the parameters of mechanistic deterioration models. The data used for the statistical estimation of such models consist of observations of pavements for which the distress has already appeared. Unfortunately, common statistical methods, when applied to such data sets, produce biased and inconsistent model parameters. This type of bias is known as selectivity bias, and it results from the fact that less durable pavement sections are over-represented in the sample used for model estimation. A joint pavement distress initiation and progression model, consisting of a discrete model of distress initiation and a continuous model of pavement progression is presented. This approach explicitly accounts for the self-selected nature of the sample used in developing the progression model, through the use of appropriate correction terms. Moreover, previous research is extended by accounting for the potential presence of unobserved heterogeneity in the model, which is related to the use of a panel data set for model estimation. This is achieved by using a random effects specification for both the discrete and continuous models. An empirical case study demonstrates the application of this approach for highway pavement cracking models.


2020 ◽  
Author(s):  
Ki-Hun Kim ◽  
Kwang-Jae Kim

BACKGROUND A lifelogs-based wellness index (LWI) is a function that calculates wellness scores from health behavior lifelogs, such as daily walking steps and sleep time collected through smartphones. A wellness score intuitively shows a user of a smart wellness service the overall condition of their health behaviors. LWI development includes LWI estimation (i.e., estimating the coefficients of the LWI from data). A panel data set of health behavior lifelogs allows LWI estimation to control for variables unobserved in the LWI and hence to be less biased. Such panel data sets are likely to have missing data due to various random events of daily life (e.g., smart devices stop collecting data when their batteries run out). Missing data can introduce biases into LWI coefficients. Thus, choosing an appropriate missing data handling method is important for reducing biases in LWI estimation with a panel data set of health behavior lifelogs. However, relevant studies are scarce in the literature. OBJECTIVE This research aims to identify a suitable missing data handling method for LWI estimation with panel data. Six representative missing data handling methods (i.e., listwise deletion (LD), mean imputation, Expectation-Maximization (EM) based multiple imputation, Predictive-Mean Matching (PMM) based multiple imputation, k-Nearest Neighbors (k-NN) based imputation, and Low-rank Approximation (LA) based imputation) are comparatively evaluated through the simulation of an existing LWI development case. METHODS A panel data set of health behavior lifelogs collected in the existing LWI development case was transformed into a reference data set. At each missingness proportion from 1% to 80%, 200 simulated data sets were generated by randomly introducing missing data into the reference data set. The six methods were applied to transform the simulated data sets into complete data sets by handling missing data. Coefficients in a linear LWI were estimated with each complete data set, following the existing case. Coefficient biases of the six methods were calculated by comparing the estimated coefficient values with reference values estimated with the reference data set. RESULTS Based on the coefficient biases, the superior methods changed according to the missingness proportion: LA based imputation, PMM based multiple imputation, and EM based multiple imputation for 1% to 30% missingness proportions; LA based imputation and PMM based multiple imputation for 31% to 60%; and only LA based imputation for over 60%. CONCLUSIONS LA based imputation was superior among the six methods regardless of the missingness proportion. This superiority is generalizable to other panel data sets of health behavior lifelogs because existing works have verified their low-rank nature, under which LA based imputation works well. This result will guide missing data handling to reduce coefficient biases in new development cases of linear LWIs with panel data.


2019 ◽  
Vol 28 (5) ◽  
pp. 558-581 ◽  
Author(s):  
Martin Abel

Abstract Using South Africa’s first nationally representative panel data set, I find that the presence of pension recipients in the household reduces the probability of employment of both previously employed and unemployed prime-aged adults. Exploiting institutional features of the disability grant to isolate the pension’s income effect suggests that the effects operate through the income mechanism. By contrast, there is no evidence that pensioners enable household members to work by providing childcare as concluded by previous studies.


1986 ◽  
Vol 6 (4) ◽  
pp. 211-226 ◽  
Author(s):  
Kenneth J. Ottenbacher

Single-subject and time-series designs have recently been advocated as a preferred method of examining clinical change in individual patients. Data from single-subject designs are frequently analyzed by means of graphic presentation and visual inspection. The presence of serial dependency or autocorrelation in data collected from a single individual can reduce the reliability and accuracy of visual inferences. Fifty-four data paths from single-subject research published in the occupational therapy literature were reviewed to determine the degree of serial dependency present in each data set. The results revealed that a large portion (41%) of the data sets contained a significant degree of autocorrelation. The implications of a high degree of serial dependency in relation to data analysis and interpretation are discussed, and methods to reduce the effect of serial dependency are suggested.


10.2196/20597 ◽  
2020 ◽  
Vol 8 (12) ◽  
pp. e20597
Author(s):  
Ki-Hun Kim ◽  
Kwang-Jae Kim

Background A lifelogs-based wellness index (LWI) is a function for calculating wellness scores based on health behavior lifelogs (eg, daily walking steps and sleep times collected via a smartwatch). A wellness score intuitively shows the users of smart wellness services the overall condition of their health behaviors. LWI development includes estimation (ie, estimating coefficients in LWI with data). A panel data set comprising health behavior lifelogs allows LWI estimation to control for unobserved variables, thereby resulting in less bias. However, these data sets typically have missing data due to events that occur in daily life (eg, smart devices stop collecting data when batteries are depleted), which can introduce biases into LWI coefficients. Thus, the appropriate choice of method to handle missing data is important for reducing biases in LWI estimations with panel data. However, there is a lack of research in this area. Objective This study aims to identify a suitable missing-data handling method for LWI estimation with panel data. Methods Listwise deletion, mean imputation, expectation maximization–based multiple imputation, predictive-mean matching–based multiple imputation, k-nearest neighbors–based imputation, and low-rank approximation–based imputation were comparatively evaluated by simulating an existing case of LWI development. A panel data set comprising health behavior lifelogs of 41 college students over 4 weeks was transformed into a reference data set without any missing data. Then, 200 simulated data sets were generated by randomly introducing missing data at proportions from 1% to 80%. The missing-data handling methods were each applied to transform the simulated data sets into complete data sets, and coefficients in a linear LWI were estimated for each complete data set. For each proportion for each method, a bias measure was calculated by comparing the estimated coefficient values with values estimated from the reference data set. 
Results Methods performed differently depending on the proportion of missing data. For 1% to 30% proportions, low-rank approximation–based imputation, predictive-mean matching–based multiple imputation, and expectation maximization–based multiple imputation were superior. For 31% to 60% proportions, low-rank approximation–based imputation and predictive-mean matching–based multiple imputation performed best. For over 60% proportions, only low-rank approximation–based imputation performed acceptably. Conclusions Low-rank approximation–based imputation was the best of the 6 data-handling methods regardless of the proportion of missing data. This superiority is generalizable to other panel data sets comprising health behavior lifelogs given their verified low-rank nature, for which low-rank approximation–based imputation is known to perform effectively. This result will guide missing-data handling in reducing coefficient biases in new development cases of linear LWIs with panel data.
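As a rough illustration of why low-rank approximation can outperform simpler imputation when the data really are low-rank, the sketch below compares column-mean imputation with an iterative truncated-SVD fill on synthetic data. The abstract does not specify the algorithm used in the study, so this iterative scheme, the function `low_rank_impute`, the rank of 2, and the synthetic 100 x 10 matrix are all assumptions for illustration only.

```python
import numpy as np

def low_rank_impute(X, rank, n_iter=200):
    """Fill NaNs by repeatedly projecting onto the best rank-`rank` approximation."""
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    # Start from column means, then refine only the missing cells.
    filled = np.where(missing, np.nanmean(X, axis=0), X)
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        approx = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        filled = np.where(missing, approx, X)   # observed cells stay untouched
    return filled

# Synthetic stand-in for a lifelog panel: rows are person-days, columns are behaviors.
rng = np.random.default_rng(0)
truth = rng.normal(size=(100, 2)) @ rng.normal(size=(2, 10))   # exactly rank 2
mask = rng.random(truth.shape) < 0.2                            # about 20% missing at random
X = truth.copy()
X[mask] = np.nan

mean_filled = np.where(mask, np.nanmean(X, axis=0), X)
low_rank_filled = low_rank_impute(X, rank=2)

def rmse(filled):
    """Root-mean-square error on the cells that were made missing."""
    return float(np.sqrt(np.mean((filled[mask] - truth[mask]) ** 2)))

rmse_mean = rmse(mean_filled)
rmse_low_rank = rmse(low_rank_filled)
print(f"mean imputation RMSE = {rmse_mean:.3f}, low-rank RMSE = {rmse_low_rank:.3f}")
```

The comparison here is on imputation error for the masked cells; the study itself evaluates bias in the estimated LWI coefficients, which would require fitting the linear index to each completed data set.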


2018 ◽  
Vol 26 (4) ◽  
pp. 431-456 ◽  
Author(s):  
Kyle L. Marquardt ◽  
Daniel Pemstein

Data sets quantifying phenomena of social-scientific interest often use multiple experts to code latent concepts. While it remains standard practice to report the average score across experts, experts likely vary in both their expertise and their interpretation of question scales. As a result, the mean may be an inaccurate statistic. Item-response theory (IRT) models provide an intuitive method for taking these forms of expert disagreement into account when aggregating ordinal ratings produced by experts, but they have rarely been applied to cross-national expert-coded panel data. We investigate the utility of IRT models for aggregating expert-coded data by comparing the performance of various IRT models to the standard practice of reporting average expert codes, using both data from the V-Dem data set and ecologically motivated simulated data. We find that IRT approaches outperform simple averages when experts vary in reliability and exhibit differential item functioning (DIF). IRT models are also generally robust even in the absence of simulated DIF or varying expert reliability. Our findings suggest that producers of cross-national data sets should adopt IRT techniques to aggregate expert-coded data measuring latent concepts.


2018 ◽  
Vol 154 (2) ◽  
pp. 149-155
Author(s):  
Michael Archer

1. Yearly records of worker Vespula germanica (Fabricius) taken in suction traps at Silwood Park (28 years) and at Rothamsted Research (39 years) are examined. 2. Using the autocorrelation function (ACF), a significant negative 1-year lag followed by a lesser non-significant positive 2-year lag was found in all, or parts of, each data set, indicating an underlying population dynamic of a 2-year cycle with a damped waveform. 3. The minimum number of years of data required before the 2-year cycle with damped waveform became apparent varied between 17 and 26; in some data sets the cycle was not found. 4. Ecological factors delaying or preventing the occurrence of the 2-year cycle are considered.
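The ACF signature described here (a negative lag-1 value followed by a smaller positive lag-2 value) can be reproduced with a toy series. The sketch below assumes a first-order autoregressive process with a negative coefficient purely for illustration; it is not the author's population model, and the series length is far longer than the 28- and 39-year records so that the pattern is easy to see.

```python
import numpy as np

def acf(x, max_lag):
    """Sample autocorrelation function for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    denom = np.sum(d * d)
    return np.array([np.sum(d[:len(d) - k] * d[k:]) / denom
                     for k in range(max_lag + 1)])

# Toy series: each value tends to overshoot in the opposite direction of the last,
# so high years alternate with low years (assumed coefficient phi = -0.5).
rng = np.random.default_rng(1)
phi, n = -0.5, 2000
x = np.zeros(n)
for t in range(1, n):
    x[t] = phi * x[t - 1] + rng.normal()

r = acf(x, max_lag=3)
print("lag 1: %.2f, lag 2: %.2f, lag 3: %.2f" % (r[1], r[2], r[3]))
```

The alternating signs with shrinking magnitude are what a damped 2-year cycle looks like in the ACF. With only a few dozen annual observations the lag-2 value is estimated much less precisely, which is consistent with it being reported as non-significant.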

