Squeezing observational data for better causal inference: Methods and examples for prevention research

Diego Garcia-Huidobro; J. Michael Oakes

doi:10.1002/ijop.12275

A Survey on Causal Inference

ACM Transactions on Knowledge Discovery from Data ◽

10.1145/3444944 ◽

2021 ◽

Vol 15 (5) ◽

pp. 1-46

Author(s):

Liuyi Yao ◽

Zhixuan Chu ◽

Sheng Li ◽

Yaliang Li ◽

Jing Gao ◽

...

Keyword(s):

Machine Learning ◽

Causal Inference ◽

Observational Data ◽

Causal Effect ◽

Research Direction ◽

Estimation Methods ◽

Potential Outcome ◽

Outcome Framework ◽

Benchmark Datasets ◽

Inference Methods

Causal inference is a critical research topic across many domains, such as statistics, computer science, education, public policy, and economics, for decades. Nowadays, estimating causal effect from observational data has become an appealing research direction owing to the large amount of available data and low budget requirement, compared with randomized controlled trials. Embraced with the rapidly developed machine learning area, various causal effect estimation methods for observational data have sprung up. In this survey, we provide a comprehensive review of causal inference methods under the potential outcome framework, one of the well-known causal inference frameworks. The methods are divided into two categories depending on whether they require all three assumptions of the potential outcome framework or not. For each category, both the traditional statistical methods and the recent machine learning enhanced methods are discussed and compared. The plausible applications of these methods are also presented, including the applications in advertising, recommendation, medicine, and so on. Moreover, the commonly used benchmark datasets as well as the open-source codes are also summarized, which facilitate researchers and practitioners to explore, evaluate and apply the causal inference methods.

Download Full-text

Framework for identifying drug repurposing candidates from observational healthcare data

JAMIA Open ◽

10.1093/jamiaopen/ooaa048 ◽

2020 ◽

Author(s):

Michal Ozery-Flato ◽

Yaara Goldschmidt ◽

Oded Shaham ◽

Sivan Ravid ◽

Chen Yanover

Keyword(s):

Causal Inference ◽

Observational Data ◽

Prescription Drugs ◽

Domain Knowledge ◽

Multiple Testing ◽

Statistical Significance ◽

Drug Effects ◽

Drug Repurposing ◽

Design Parameters ◽

Medical Databases

Abstract Objective Observational medical databases, such as electronic health records and insurance claims, track the healthcare trajectory of millions of individuals. These databases provide real-world longitudinal information on large cohorts of patients and their medication prescription history. We present an easy-to-customize framework that systematically analyzes such databases to identify new indications for on-market prescription drugs. Materials and Methods Our framework provides an interface for defining study design parameters and extracting patient cohorts, disease-related outcomes, and potential confounders in observational databases. It then applies causal inference methodology to emulate hundreds of randomized controlled trials (RCTs) for prescribed drugs, while adjusting for confounding and selection biases. After correcting for multiple testing, it outputs the estimated effects and their statistical significance in each database. Results We demonstrate the utility of the framework in a case study of Parkinson’s disease (PD) and evaluate the effect of 259 drugs on various PD progression measures in two observational medical databases, covering more than 150 million patients. The results of these emulated trials reveal remarkable agreement between the two databases for the most promising candidates. Discussion Estimating drug effects from observational data is challenging due to data biases and noise. To tackle this challenge, we integrate causal inference methodology with domain knowledge and compare the estimated effects in two separate databases. Conclusion Our framework enables systematic search for drug repurposing candidates by emulating RCTs using observational data. The high level of agreement between separate databases strongly supports the identified effects.

Download Full-text

Policy evaluation using causal inference methods

10.4337/9781788976480.00019 ◽

2021 ◽

pp. 294-324

Author(s):

Denis Fougère

Keyword(s):

Causal Inference ◽

Policy Evaluation ◽

Inference Methods

Download Full-text

A comparative study of design-based and analysis-based approaches to causal inference with observational data

Biostatistics & Epidemiology ◽

10.1080/24709360.2021.1992246 ◽

2021 ◽

pp. 1-10

Author(s):

Junni L. Zhang

Keyword(s):

Comparative Study ◽

Causal Inference ◽

Observational Data

Download Full-text

Random Forests Approach for Causal Inference with Clustered Observational Data

Multivariate Behavioral Research ◽

10.1080/00273171.2020.1808437 ◽

2020 ◽

pp. 1-24

Author(s):

Youmi Suk ◽

Hyunseung Kang ◽

Jee-Seon Kim

Keyword(s):

Causal Inference ◽

Observational Data ◽

Random Forests

Download Full-text

Mendel’s laws, Mendelian randomization and causal inference in observational data: substantive and nomenclatural issues

European Journal of Epidemiology ◽

10.1007/s10654-020-00622-7 ◽

2020 ◽

Vol 35 (2) ◽

pp. 99-111 ◽

Cited By ~ 8

Author(s):

George Davey Smith ◽

Michael V. Holmes ◽

Neil M. Davies ◽

Shah Ebrahim

Keyword(s):

Causal Inference ◽

Observational Data ◽

Mendelian Randomization ◽

Mendel’S Laws

Download Full-text

Alternative causal inference methods in population health research: Evaluating tradeoffs and triangulating evidence

SSM - Population Health ◽

10.1016/j.ssmph.2019.100526 ◽

2020 ◽

Vol 10 ◽

pp. 100526 ◽

Cited By ~ 4

Author(s):

Ellicott C. Matthay ◽

Erin Hagan ◽

Laura M. Gottlieb ◽

May Lynn Tan ◽

David Vlahov ◽

...

Keyword(s):

Causal Inference ◽

Population Health ◽

Health Research ◽

Inference Methods

Download Full-text

You Can’t Drive a Car With Only Three Wheels

American Journal of Epidemiology ◽

10.1093/aje/kwz119 ◽

2019 ◽

Vol 188 (9) ◽

pp. 1682-1685 ◽

Cited By ~ 1

Author(s):

Hailey R Banack

Keyword(s):

Causal Inference ◽

Observational Data ◽

Causal Effect ◽

Causal Effects ◽

Measurement Bias ◽

Exposure Misclassification ◽

Unmeasured Confounding ◽

Obstetrical Care ◽

The Impact ◽

Fundamental Requirement

Abstract Authors aiming to estimate causal effects from observational data frequently discuss 3 fundamental identifiability assumptions for causal inference: exchangeability, consistency, and positivity. However, too often, studies fail to acknowledge the importance of measurement bias in causal inference. In the presence of measurement bias, the aforementioned identifiability conditions are not sufficient to estimate a causal effect. The most fundamental requirement for estimating a causal effect is knowing who is truly exposed and unexposed. In this issue of the Journal, Caniglia et al. (Am J Epidemiol. 2019;000(00):000–000) present a thorough discussion of methodological challenges when estimating causal effects in the context of research on distance to obstetrical care. Their article highlights empirical strategies for examining nonexchangeability due to unmeasured confounding and selection bias and potential violations of the consistency assumption. In addition to the important considerations outlined by Caniglia et al., authors interested in estimating causal effects from observational data should also consider implementing quantitative strategies to examine the impact of misclassification. The objective of this commentary is to emphasize that you can’t drive a car with only three wheels, and you also cannot estimate a causal effect in the presence of exposure misclassification bias.

Download Full-text