A Causal Dirichlet Mixture Model for Causal Inference from Observational Data

Adi Lin; Jie Lu; Junyu Xuan; Fujin Zhu; Guangquan Zhang

doi:10.1145/3379500

Framework for identifying drug repurposing candidates from observational healthcare data

JAMIA Open ◽

10.1093/jamiaopen/ooaa048 ◽

2020 ◽

Author(s):

Michal Ozery-Flato ◽

Yaara Goldschmidt ◽

Oded Shaham ◽

Sivan Ravid ◽

Chen Yanover

Keyword(s):

Causal Inference ◽

Observational Data ◽

Prescription Drugs ◽

Domain Knowledge ◽

Multiple Testing ◽

Statistical Significance ◽

Drug Effects ◽

Drug Repurposing ◽

Design Parameters ◽

Medical Databases

Abstract. Objective: Observational medical databases, such as electronic health records and insurance claims, track the healthcare trajectory of millions of individuals. These databases provide real-world longitudinal information on large cohorts of patients and their medication prescription history. We present an easy-to-customize framework that systematically analyzes such databases to identify new indications for on-market prescription drugs. Materials and Methods: Our framework provides an interface for defining study design parameters and extracting patient cohorts, disease-related outcomes, and potential confounders in observational databases. It then applies causal inference methodology to emulate hundreds of randomized controlled trials (RCTs) for prescribed drugs, while adjusting for confounding and selection biases. After correcting for multiple testing, it outputs the estimated effects and their statistical significance in each database. Results: We demonstrate the utility of the framework in a case study of Parkinson’s disease (PD) and evaluate the effect of 259 drugs on various PD progression measures in two observational medical databases, covering more than 150 million patients. The results of these emulated trials reveal remarkable agreement between the two databases for the most promising candidates. Discussion: Estimating drug effects from observational data is challenging due to data biases and noise. To tackle this challenge, we integrate causal inference methodology with domain knowledge and compare the estimated effects in two separate databases. Conclusion: Our framework enables systematic search for drug repurposing candidates by emulating RCTs using observational data. The high level of agreement between separate databases strongly supports the identified effects.
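The multiple-testing step mentioned in this abstract can be illustrated with a short sketch. The abstract does not name the correction procedure, so the Benjamini-Hochberg false discovery rate procedure shown here is an assumption chosen for illustration, and the function name `benjamini_hochberg` and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure (illustrative assumption,
    not necessarily the correction used in the cited framework).

    Returns a list of booleans, True where the corresponding
    hypothesis is rejected at false discovery rate `alpha`.
    """
    m = len(p_values)
    if m == 0:
        return []

    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest 1-based rank k with p_(k) <= (k / m) * alpha.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank

    # Reject every hypothesis whose rank is at most max_rank.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # Hypothetical p-values from ten emulated trials.
    p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
    print(benjamini_hochberg(p, alpha=0.05))
```

With hundreds of emulated trials per database, a correction of this kind limits the share of false positives among the drugs flagged as significant.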

A Survey on Causal Inference

ACM Transactions on Knowledge Discovery from Data ◽

10.1145/3444944 ◽

2021 ◽

Vol 15 (5) ◽

pp. 1-46

Author(s):

Liuyi Yao ◽

Zhixuan Chu ◽

Sheng Li ◽

Yaliang Li ◽

Jing Gao ◽

...

Keyword(s):

Machine Learning ◽

Causal Inference ◽

Observational Data ◽

Causal Effect ◽

Research Direction ◽

Estimation Methods ◽

Potential Outcome ◽

Outcome Framework ◽

Benchmark Datasets ◽

Inference Methods

Causal inference has been a critical research topic across many domains, such as statistics, computer science, education, public policy, and economics, for decades. Estimating causal effects from observational data has become an appealing research direction owing to the large amount of available data and its low budget requirement compared with randomized controlled trials. Combined with the rapidly developing field of machine learning, various causal effect estimation methods for observational data have emerged. In this survey, we provide a comprehensive review of causal inference methods under the potential outcome framework, one of the well-known causal inference frameworks. The methods are divided into two categories depending on whether they require all three assumptions of the potential outcome framework. For each category, both traditional statistical methods and recent machine-learning-enhanced methods are discussed and compared. Plausible applications of these methods are also presented, including advertising, recommendation, and medicine. Moreover, commonly used benchmark datasets and open-source code are summarized, helping researchers and practitioners explore, evaluate, and apply causal inference methods.

Discriminative Learning Approach Based on Flexible Mixture Model for Medical Data Categorization and Recognition

Sensors ◽

10.3390/s21072450 ◽

2021 ◽

Vol 21 (7) ◽

pp. 2450

Author(s):

Fahd Alharithi ◽

Ahmed Almulihi ◽

Sami Bourouis ◽

Roobaea Alroobaea ◽

Nizar Bouguila

Keyword(s):

Mixture Model ◽

Hybrid Approach ◽

Medical Data ◽

Support Vector ◽

Discriminative Learning ◽

Learning Approach ◽

X Rays ◽

Intrinsic Nature ◽

Discriminative Models ◽

Dirichlet Mixture

In this paper, we propose a novel hybrid discriminative learning approach based on the shifted-scaled Dirichlet mixture model (SSDMM) and Support Vector Machines (SVMs) to address some challenging problems of medical data categorization and recognition. The main goal is to accurately capture the intrinsic nature of biomedical images by considering the desirable properties of both generative and discriminative models. To achieve this objective, we propose to derive new data-based SVM kernels generated from the developed mixture model SSDMM. The proposed approach includes the following steps: the extraction of robust local descriptors, the learning of the developed mixture model via the expectation–maximization (EM) algorithm, and finally the building of three SVM kernels for data categorization and classification. The potential of the implemented framework is illustrated through two challenging problems: the categorization of retinal images into normal or diabetic cases and the recognition of lung diseases in chest X-ray (CXR) images. The obtained results demonstrate the merits of our hybrid approach compared with other methods.

High-Dimensional Unsupervised Selection and Estimation of a Finite Generalized Dirichlet Mixture Model Based on Minimum Message Length

IEEE Transactions on Pattern Analysis and Machine Intelligence ◽

10.1109/tpami.2007.1095 ◽

2007 ◽

Vol 29 (10) ◽

pp. 1716-1731 ◽

Cited By ~ 107

Author(s):

Nizar Bouguila ◽

Djemel Ziou

Keyword(s):

Mixture Model ◽

High Dimensional ◽

Minimum Message Length ◽

Message Length ◽

Model Based ◽

Dirichlet Mixture ◽

Generalized Dirichlet

A comparative study of design-based and analysis-based approaches to causal inference with observational data

Biostatistics & Epidemiology ◽

10.1080/24709360.2021.1992246 ◽

2021 ◽

pp. 1-10

Author(s):

Junni L. Zhang

Keyword(s):

Comparative Study ◽

Causal Inference ◽

Observational Data

Random Forests Approach for Causal Inference with Clustered Observational Data

Multivariate Behavioral Research ◽

10.1080/00273171.2020.1808437 ◽

2020 ◽

pp. 1-24

Author(s):

Youmi Suk ◽

Hyunseung Kang ◽

Jee-Seon Kim

Keyword(s):

Causal Inference ◽

Observational Data ◽

Random Forests

Mendel’s laws, Mendelian randomization and causal inference in observational data: substantive and nomenclatural issues

European Journal of Epidemiology ◽

10.1007/s10654-020-00622-7 ◽

2020 ◽

Vol 35 (2) ◽

pp. 99-111 ◽

Cited By ~ 8

Author(s):

George Davey Smith ◽

Michael V. Holmes ◽

Neil M. Davies ◽

Shah Ebrahim

Keyword(s):

Causal Inference ◽

Observational Data ◽

Mendelian Randomization ◽

Mendel’s Laws

You Can’t Drive a Car With Only Three Wheels

American Journal of Epidemiology ◽

10.1093/aje/kwz119 ◽

2019 ◽

Vol 188 (9) ◽

pp. 1682-1685 ◽

Cited By ~ 1

Author(s):

Hailey R Banack

Keyword(s):

Causal Inference ◽

Observational Data ◽

Causal Effect ◽

Causal Effects ◽

Measurement Bias ◽

Exposure Misclassification ◽

Unmeasured Confounding ◽

Obstetrical Care ◽

The Impact ◽

Fundamental Requirement

Abstract. Authors aiming to estimate causal effects from observational data frequently discuss 3 fundamental identifiability assumptions for causal inference: exchangeability, consistency, and positivity. However, too often, studies fail to acknowledge the importance of measurement bias in causal inference. In the presence of measurement bias, the aforementioned identifiability conditions are not sufficient to estimate a causal effect. The most fundamental requirement for estimating a causal effect is knowing who is truly exposed and unexposed. In this issue of the Journal, Caniglia et al. (Am J Epidemiol. 2019;000(00):000–000) present a thorough discussion of methodological challenges when estimating causal effects in the context of research on distance to obstetrical care. Their article highlights empirical strategies for examining nonexchangeability due to unmeasured confounding and selection bias and potential violations of the consistency assumption. In addition to the important considerations outlined by Caniglia et al., authors interested in estimating causal effects from observational data should also consider implementing quantitative strategies to examine the impact of misclassification. The objective of this commentary is to emphasize that you can’t drive a car with only three wheels, and you also cannot estimate a causal effect in the presence of exposure misclassification bias.
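The effect of exposure misclassification described in this abstract can be made concrete with a small calculation. The sketch below is not taken from the commentary: it assumes a binary exposure, an outcome measured without error, no confounding, and non-differential misclassification governed by a sensitivity and a specificity. The function name `observed_risk_difference` and all numeric values are hypothetical.

```python
def observed_risk_difference(prevalence, risk_exposed, risk_unexposed,
                             sensitivity, specificity):
    """Expected risk difference when exposure is misclassified.

    Assumes (for illustration only) non-differential misclassification:
    the chance of recording exposure correctly does not depend on the outcome.
    """
    q, r1, r0 = prevalence, risk_exposed, risk_unexposed

    # Share of the population recorded as exposed: true positives
    # plus false positives.
    rec_exp_true = sensitivity * q
    rec_exp_false = (1 - specificity) * (1 - q)

    # Share recorded as unexposed: false negatives plus true negatives.
    rec_unexp_false = (1 - sensitivity) * q
    rec_unexp_true = specificity * (1 - q)

    # Each recorded group mixes truly exposed and truly unexposed people,
    # so its risk is a weighted average of the two true risks.
    risk_rec_exp = (rec_exp_true * r1 + rec_exp_false * r0) / (
        rec_exp_true + rec_exp_false)
    risk_rec_unexp = (rec_unexp_false * r1 + rec_unexp_true * r0) / (
        rec_unexp_false + rec_unexp_true)
    return risk_rec_exp - risk_rec_unexp


if __name__ == "__main__":
    true_rd = 0.3 - 0.1  # true risk difference in this hypothetical setup
    biased_rd = observed_risk_difference(
        prevalence=0.5, risk_exposed=0.3, risk_unexposed=0.1,
        sensitivity=0.8, specificity=0.9)
    print(f"true RD = {true_rd:.3f}, observed RD = {biased_rd:.3f}")
```

In this hypothetical setting the true risk difference of 0.20 shrinks to roughly 0.14 once exposure is recorded imperfectly, even though exchangeability, consistency, and positivity hold by construction.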
